Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

High-Dimensional L2-Boosting: Rate of Convergence

Authors: Ye Luo, Martin Spindler, Jannis Kueck

JMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we present simulation studies to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, L2-Boosting clearly outperforms Lasso.
Researcher Affiliation Academia Ye Luo (EMAIL), Hong Kong University Business School, The University of Hong Kong, Hong Kong; Martin Spindler (EMAIL), Institute for Statistics, University of Hamburg, Germany; Jannis Kueck (EMAIL), Düsseldorf Institute for Competition Economics, Heinrich Heine University Düsseldorf, Germany
Pseudocode Yes Algorithm 1 (L2-Boosting (BA/PGA)); Algorithm 2 (restricted L2-Boosting Algorithm (res BA/res PGA)); Algorithm 3 (Orthogonal L2-Boosting (o BA))
Open Source Code No The boosting procedures were implemented by the authors and the code is available upon request.
Open Datasets Yes The data set has been provided by DSM (Kaiseraugst, Switzerland) and was made publicly available for academic research in Bühlmann et al. (2014) (Supplemental Material).
Dataset Splits Yes The size of the training set was 60 and the remaining 11 observations were used for forecasting.
Hardware Specification No The text does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies Yes The simulations were performed in R (R Core Team (2014)). For Lasso estimation the packages hdm by Chernozhukov et al. (2015) and glmnet by Jerome Friedman (2010) (for cross-validation) were used.
Experiment Setup Yes The goal of this section is to give an illustration of the different stopping criteria. We employ the following data generating process (dgp): y = 5x1 + 2x2 + 1x3 + 0x4 + ... + 0x10 + ε (6), where ε ~ N(0, 2²) and X = (X1, ..., X10) ~ N10(0, I10), with I10 denoting the identity matrix of size 10 × 10.
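To make the reported setup concrete, the following is a minimal Python sketch of the dgp in (6) together with componentwise L2-Boosting in the spirit of the paper's Algorithm 1 (BA/PGA). The sample size, step size `nu`, and iteration count are illustrative assumptions, not values taken from the paper, and this is not the authors' implementation (which is available on request).

```python
import numpy as np

rng = np.random.default_rng(0)

# dgp (6): y = 5*x1 + 2*x2 + 1*x3 + 0*x4 + ... + 0*x10 + eps
n, p = 200, 10                            # n is an illustrative assumption
X = rng.standard_normal((n, p))           # X ~ N_10(0, I_10)
beta = np.array([5.0, 2.0, 1.0] + [0.0] * 7)
y = X @ beta + rng.normal(0.0, 2.0, n)    # eps ~ N(0, 2^2)

def l2_boost(X, y, steps=200, nu=0.1):
    """Componentwise L2-Boosting: at each iteration, fit the current
    residual by ordinary least squares on the single covariate that most
    reduces the residual sum of squares, and take a shrunken step nu."""
    n, p = X.shape
    coef = np.zeros(p)
    resid = y.copy()
    col_ss = (X ** 2).sum(axis=0)         # column sums of squares
    for _ in range(steps):
        b = X.T @ resid / col_ss          # univariate OLS coefficients
        j = int(np.argmax(b ** 2 * col_ss))  # best RSS reduction
        coef[j] += nu * b[j]
        resid -= nu * b[j] * X[:, j]
    return coef

coef = l2_boost(X, y)
print(np.round(coef, 2))  # first three coefficients should dominate
```

Running a sketch like this recovers estimates near (5, 2, 1) for the first three coefficients and values close to zero for the rest, illustrating why the simulation is a natural testbed for the stopping criteria discussed in the paper.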