Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution
Authors: Yichen Zhou, Giles Hooker
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A simulation study and real world examples provide support for both the predictive accuracy of the model and its limiting behavior. Keywords: gradient boosting, regression tree, regularization, limiting distribution. ... We have conducted an empirical study to demonstrate the performance of Boulevard. |
| Researcher Affiliation | Academia | Yichen Zhou EMAIL Department of Statistics and Data Science Cornell University Ithaca, NY 14853, USA. Giles Hooker EMAIL Department of Statistics University of California, Berkeley Berkeley, CA 94720, USA. |
| Pseudocode | Yes | Algorithm 1 (Boulevard). ... Algorithm 2 (Trees for Non-adaptive Boosting). ... Algorithm 3 (Tail Snapshot Boulevard). |
| Open Source Code | Yes | The empirical study code is provided at: https://github.com/siriuz42/boulevard.git |
| Open Datasets | Yes | Results on four real world data sets selected from UCI Machine Learning Repository (Dheeru and Karra Taniskidou, 2017; Tüfekci, 2014; Kaya et al., 2012) are shown in Figure 4. |
| Dataset Splits | Yes | All curves are averages after 5-fold cross validation. ... Figure 8 shows the result when we generate the 90% reproduction intervals for two real world datasets from UCI, namely CCPP and CASP. For each dataset, we take the first 10 examples as test examples, and split the rest of the dataset into 11 folds. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. It mentions 'simulation study' and 'empirical study' but no details on CPU, GPU, or other computing resources. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. While it provides a link to code, the specific versions are not detailed within the paper itself. |
| Experiment Setup | Yes | Table 1: Parameters used in empirical study (columns: label: n, θ, ntree, k, λ). MSE-(1-4): 5000, 0.3, 1000, 20, 0.8; MSE-Boston: 506, 0.8, 1000, 5, 0.8; MSE-CCPP: 9568, 0.5, 1000, 50, 0.8; MSE-CASP: 20000, 0.5, 1000, 50, 0.8; MSE-Airfoil: 1503, 0.8, 1000, 40, 0.8; Limiting-(1-4): 1000, 0.8, 2000, 10, 0.5; Variance-(1-4): 5000, 0.8, 3000, 20, 0.5; RI-(1-2): 1000, 0.8, 2000, 10, 0.5; RI-(3-4): 5000, 0.8, 2000, 10, 0.5 |
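
The split quoted in the Dataset Splits row (first 10 examples held out as test examples, the remainder divided into 11 folds) could be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function name `split_first_k_then_folds` is hypothetical, and the quoted text does not say whether the remaining rows are shuffled or how uneven fold sizes are handled, so contiguous and nearly equal folds are assumed here. The sample count 9568 is the `n` listed for MSE-CCPP in Table 1.

```python
import numpy as np


def split_first_k_then_folds(n_samples, n_test=10, n_folds=11, seed=None):
    """Hold out the first `n_test` rows, then split the rest into `n_folds` folds.

    Assumption: folds are contiguous unless a seed is given, in which case the
    remaining indices are shuffled first. The paper's quoted text specifies
    neither choice.
    """
    indices = np.arange(n_samples)
    test_idx = indices[:n_test]   # first 10 examples by default
    rest = indices[n_test:]       # everything after the held-out rows
    if seed is not None:
        rest = np.random.default_rng(seed).permutation(rest)
    # array_split allows fold sizes that differ by at most one element
    folds = np.array_split(rest, n_folds)
    return test_idx, folds


# Example with the CCPP sample size from Table 1 (n = 9568).
test_idx, folds = split_first_k_then_folds(9568)
fold_sizes = [len(f) for f in folds]
```

With these defaults, `test_idx` holds row indices 0 through 9 and `folds` is a list of 11 disjoint index arrays covering the remaining rows.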