Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multivariate Boosted Trees and Applications to Forecasting and Control
Authors: Lorenzo Nespoli, Vasco Medici
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present numerical results of the responses and loss functions introduced in section 3. For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation. |
| Researcher Affiliation | Collaboration | Lorenzo Nespoli¹,² and Vasco Medici¹. ¹ISAAC, SUPSI, Mendrisio, CH; ²Hive Power SA, Manno, CH |
| Pseudocode | Yes | Algorithm 1 describes the boosting procedure: starting from an initial guess for ŷ, which in this case corresponds to the column-expectations of y, we retrieve the gradient g and Hessian h matrices for all the observations of the dataset (line 3), given the loss function L and the leaf response function r. At line 4 the weak learner at iteration k is fitted using the fit-tree procedure described in Algorithm 2. |
| Open Source Code | Yes | The algorithm has been released as a Python package under the MIT license, and it is freely available at https://github.com/supsi-dacd-isaac/mbtr. All the code used for running the experiments presented in the paper, including the code for generating the figures, is available at https://github.com/supsi-dacd-isaac/mbtr_experiments. |
| Open Datasets | Yes | All the used datasets are freely accessible and are downloaded directly by the experiments' code. The datasets used for the numerical experiments can be downloaded from https://zenodo.org/record/4108561#.YEeukVmYWV5 and https://zenodo.org/record/4549296#.YEeuvFmYWV4. |
| Dataset Splits | Yes | For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation. An example of training and testing splits under this cross-validation is shown in Fig. 2, in the case of 3 folds. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned in the paper for running the experiments. |
| Software Dependencies | No | The paper mentions software like the Python package, LightGBM and XGBoost, the pvlib Python library, OpenDSS, and the tsutils R package, but does not specify their version numbers. |
| Experiment Setup | Yes | In all the experiments the hyperparameters were fixed to the following values in order to guarantee a fair comparison with the LightGBM regressors. For all the experiments, we kept LightGBM's number of iterations fixed at 100 and the learning rate at 0.1, as for the MBT models. Table 4 shows the most important parameters for the different experiments carried out in the paper. |
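The boosting procedure quoted in the Pseudocode row (initialize ŷ at the expectation of y, compute gradients and Hessians under the loss, fit a weak learner each iteration) can be illustrated with a minimal NumPy sketch. This is not the paper's MBT implementation: it uses the scalar squared loss, hand-rolled regression stumps as weak learners, and the illustrative parameters `n_iter` and `lr`, all chosen here for brevity.

```python
import numpy as np

def fit_stump(x, target):
    """Fit a one-split regression stump on a single feature."""
    order = np.argsort(x)
    xs, ts = x[order], target[order]
    best_sse, best_stump = np.inf, None
    for i in range(1, len(xs)):
        left, right = ts[:i].mean(), ts[i:].mean()
        sse = ((ts[:i] - left) ** 2).sum() + ((ts[i:] - right) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            best_stump = ((xs[i - 1] + xs[i]) / 2, left, right)
    return best_stump  # (threshold, left_value, right_value)

def predict_stump(stump, x):
    thr, left, right = stump
    return np.where(x <= thr, left, right)

def boost(x, y, n_iter=100, lr=0.1):
    """Gradient boosting sketch: mean init, then fit stumps to the negative gradient."""
    y_hat = np.full_like(y, y.mean())  # initial guess: expectation of y
    stumps = []
    for _ in range(n_iter):
        g = y_hat - y                   # gradient of 0.5 * (y_hat - y)^2
        stump = fit_stump(x, -g)        # weak learner fitted to the negative gradient
        y_hat = y_hat + lr * predict_stump(stump, x)
        stumps.append(stump)
    return stumps, y_hat
```

For the squared loss the Hessian is constant, so the gradient step and the Newton step coincide; a general loss would also carry the per-observation Hessian h into the leaf fit, as the paper's Algorithm 1 does.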
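The Dataset Splits row describes sliding-window cross-validation for temporal data: each fold trains on a contiguous block and tests on the block that immediately follows it, so no future observations leak into training. A minimal sketch of such a fold generator, with the hypothetical parameters `n_folds` and `train_frac` (the paper's actual window sizes are given in its Fig. 2 and Table 4):

```python
import numpy as np

def sliding_window_folds(n, n_folds=3, train_frac=0.5):
    """Yield (train_idx, test_idx) pairs for sliding-window CV on a series of length n.

    The training window slides forward by one test-block per fold, so every
    test block lies strictly after its training block in time.
    """
    train_len = int(n * train_frac)
    test_len = (n - train_len) // n_folds
    for k in range(n_folds):
        start = k * test_len
        train_idx = np.arange(start, start + train_len)
        test_idx = np.arange(start + train_len, start + train_len + test_len)
        yield train_idx, test_idx
```

The same splitting pattern is available off the shelf as `sklearn.model_selection.TimeSeriesSplit` (with `max_train_size` set to obtain a sliding rather than expanding window).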