Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multivariate Boosted Trees and Applications to Forecasting and Control

Authors: Lorenzo Nespoli, Vasco Medici

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present numerical results of the responses and loss functions introduced in section 3. For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation.
Researcher Affiliation | Collaboration | Lorenzo Nespoli (1,2), Vasco Medici (1); 1: ISAAC, SUPSI, Mendrisio, CH; 2: Hive Power SA, Manno, CH
Pseudocode | Yes | Algorithm 1 describes the boosting procedure: starting from an initial guess for ŷ, which in this case corresponds to the column-expectations of y, we retrieve the gradient g and Hessian matrices h for all the observations of the dataset (line 3), given the loss function L and the leaf response function r. At line 4 the weak learner at iteration k is fitted using the fit-tree algorithm described in Algorithm 2.
Open Source Code | Yes | The algorithm has been released as a Python package under the MIT license, and it is freely available at https://github.com/supsi-dacd-isaac/mbtr. All the code used for running the experiments presented in the paper, including the code for generating the figures, is available at https://github.com/supsi-dacd-isaac/mbtr_experiments.
Open Datasets | Yes | All the used datasets are freely accessible and directly downloaded by the experiments' code. The datasets used for the numerical experiments can be downloaded from https://zenodo.org/record/4108561#.YEeukVmYWV5 and https://zenodo.org/record/4549296#.YEeuvFmYWV4.
Dataset Splits | Yes | For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation. An example of training and testing splits under this cross-validation is shown in Fig. 2, in the case of 3 folds.
Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned in the paper for running the experiments.
Software Dependencies | No | The paper mentions software like 'python package', 'LightGBM and XGBoost', 'PVlib python library', 'OpenDSS', and 'tsutils R package' but does not specify their version numbers.
Experiment Setup | Yes | In all the experiments the hyperparameters were fixed to the following values, in order to guarantee a fair comparison with the LightGBM regressors. For all the experiments, we kept LightGBM's number of iterations fixed to 100 and a learning rate of 0.1, as for the MBT models. Table 4 shows the most important parameters for the different experiments carried out in the paper.
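To make the Pseudocode and Experiment Setup evidence above concrete, here is a minimal, hypothetical sketch of the boosting loop the paper describes: initialize ŷ at the column means of y, compute gradients and Hessians, and fit a weak learner each iteration. It assumes squared loss (so the Hessian reduces to the identity and the Newton step is just the negative gradient), uses a toy depth-1 stump as the weak learner, and fixes 100 iterations and a 0.1 learning rate as in the Experiment Setup row. The names `fit_stump`, `predict_stump`, and `mbt_sketch` are invented for illustration and are not the mbtr package's API.

```python
import numpy as np

def fit_stump(x, r):
    """Toy depth-1 tree: split a scalar feature x at its median and
    predict the mean of the multivariate residuals r on each side."""
    t = np.median(x)
    left = x <= t
    if left.all() or not left.any():  # degenerate split: predict the global mean
        m = r.mean(axis=0)
        return t, m, m
    return t, r[left].mean(axis=0), r[~left].mean(axis=0)

def predict_stump(stump, x):
    t, v_left, v_right = stump
    # broadcast the per-leaf vectors over the observations
    return np.where((x <= t)[:, None], v_left, v_right)

def mbt_sketch(x, y, n_iter=100, lr=0.1):
    """Boosting loop sketch for multivariate targets under squared loss."""
    y_hat = np.tile(y.mean(axis=0), (len(y), 1))  # initial guess: column means of y
    for _ in range(n_iter):
        g = y_hat - y             # gradient of 0.5 * ||y_hat - y||^2; Hessian is the identity
        stump = fit_stump(x, -g)  # weak learner fitted to the Newton step -g
        y_hat = y_hat + lr * predict_stump(stump, x)
    return y_hat
```

With a general loss, the paper's algorithm would use the full Hessian h and a leaf response function r; this sketch only captures the identity-Hessian special case.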
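The sliding-window cross-validation named in the Dataset Splits row can likewise be sketched generically: each fold trains on one contiguous window and tests on the window immediately after it, so test data is always later in time than training data. The function below is a hypothetical illustration with equal-sized, non-overlapping windows; the paper's actual window sizes and layout are shown in its Fig. 2.

```python
def sliding_window_splits(n, n_folds=3):
    """Index splits for sliding-window CV over n time-ordered observations:
    fold k trains on window k and tests on window k + 1."""
    window = n // (n_folds + 1)
    splits = []
    for k in range(n_folds):
        train = list(range(k * window, (k + 1) * window))
        test = list(range((k + 1) * window, (k + 2) * window))
        splits.append((train, test))
    return splits
```

For example, `sliding_window_splits(12, 3)` yields fold 0 as train [0..2] / test [3..5], with each subsequent fold shifted one window forward in time.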