Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multivariate Boosted Trees and Applications to Forecasting and Control
Authors: Lorenzo Nespoli, Vasco Medici
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present numerical results of the responses and loss functions introduced in section 3. For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation. |
| Researcher Affiliation | Collaboration | Lorenzo Nespoli¹,² and Vasco Medici¹. ¹ISAAC, SUPSI, Mendrisio, CH; ²Hive Power SA, Manno, CH |
| Pseudocode | Yes | Algorithm 1 describes the boosting procedure: starting from an initial guess for ŷ, which in this case corresponds to the column-expectations of y, we retrieve the gradient g and Hessian h matrices for all the observations of the dataset (line 3), given the loss function L and the leaf response function r. At line 4 the weak learner at iteration k is fitted using the fit-tree procedure described in Algorithm 2. |
| Open Source Code | Yes | The algorithm has been released as a Python package under the MIT license, and it is freely available at https://github.com/supsi-dacd-isaac/mbtr. All the code used for running the experiments presented in the paper, including the code for generating the figures, is available at https://github.com/supsi-dacd-isaac/mbtr_experiments. |
| Open Datasets | Yes | All the used datasets are freely accessible and are downloaded directly by the experiments' code. The datasets used for the numerical experiments can be downloaded from https://zenodo.org/record/4108561#.YEeukVmYWV5 and https://zenodo.org/record/4549296#.YEeuvFmYWV4. |
| Dataset Splits | Yes | For all the datasets, we obtained the results using k-fold cross-validation (CV). Since all the applications deal with temporal data, we adopted sliding-window cross-validation. An example of training and testing splits under this cross-validation is shown in Fig. 2, in the case of 3 folds. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned in the paper for running the experiments. |
| Software Dependencies | No | The paper mentions software like the Python package, LightGBM and XGBoost, the pvlib Python library, OpenDSS, and the tsutils R package, but does not specify their version numbers. |
| Experiment Setup | Yes | In all the experiments the hyperparameters were fixed to the following values in order to guarantee a fair comparison with the LightGBM regressors. For all the experiments, we kept LightGBM's number of iterations fixed at 100 and the learning rate at 0.1, as for the MBT models. Table 4 shows the most important parameters for the different experiments carried out in the paper. |
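The boosting procedure quoted in the Pseudocode row (initialize ŷ at the expectation of y, compute gradients and Hessians under the loss, fit a weak learner each iteration) can be illustrated with a minimal NumPy sketch. This is not the paper's MBT implementation: it uses the scalar squared loss, hand-rolled regression stumps as weak learners, and the illustrative parameters `n_iter` and `lr`, all chosen here for brevity.

```python
import numpy as np

def fit_stump(x, target):
    """Fit a one-split regression stump on a single feature."""
    order = np.argsort(x)
    xs, ts = x[order], target[order]
    best_sse, best_stump = np.inf, None
    for i in range(1, len(xs)):
        left, right = ts[:i].mean(), ts[i:].mean()
        sse = ((ts[:i] - left) ** 2).sum() + ((ts[i:] - right) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            best_stump = ((xs[i - 1] + xs[i]) / 2, left, right)
    return best_stump  # (threshold, left_value, right_value)

def predict_stump(stump, x):
    thr, left, right = stump
    return np.where(x <= thr, left, right)

def boost(x, y, n_iter=100, lr=0.1):
    """Gradient boosting sketch: mean init, then fit stumps to the negative gradient."""
    y_hat = np.full_like(y, y.mean())  # initial guess: expectation of y
    stumps = []
    for _ in range(n_iter):
        g = y_hat - y                   # gradient of 0.5 * (y_hat - y)^2
        stump = fit_stump(x, -g)        # weak learner fitted to the negative gradient
        y_hat = y_hat + lr * predict_stump(stump, x)
        stumps.append(stump)
    return stumps, y_hat
```

For the squared loss the Hessian is constant, so the gradient step and the Newton step coincide; a general loss would also carry the per-observation Hessian h into the leaf fit, as the paper's Algorithm 1 does.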
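The Dataset Splits row describes sliding-window cross-validation for temporal data: each fold trains on a contiguous block and tests on the block that immediately follows it, so no future observations leak into training. A minimal sketch of such a fold generator, with the hypothetical parameters `n_folds` and `train_frac` (the paper's actual window sizes are given in its Fig. 2 and Table 4):

```python
import numpy as np

def sliding_window_folds(n, n_folds=3, train_frac=0.5):
    """Yield (train_idx, test_idx) pairs for sliding-window CV on a series of length n.

    The training window slides forward by one test-block per fold, so every
    test block lies strictly after its training block in time.
    """
    train_len = int(n * train_frac)
    test_len = (n - train_len) // n_folds
    for k in range(n_folds):
        start = k * test_len
        train_idx = np.arange(start, start + train_len)
        test_idx = np.arange(start + train_len, start + train_len + test_len)
        yield train_idx, test_idx
```

The same splitting pattern is available off the shelf as `sklearn.model_selection.TimeSeriesSplit` (with `max_train_size` set to obtain a sliding rather than expanding window).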