Margins are Insufficient for Explaining Gradient Boosting
Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we first demonstrate that the kth margin bound is inadequate in explaining the performance of state-of-the-art gradient boosters. We then explain the shortcomings of the kth margin bound and prove a stronger and more refined margin-based generalization bound for boosted classifiers that indeed succeeds in explaining the performance of modern gradient boosters. Finally, we improve upon the recent generalization lower bound by Grønlund et al. (2019). ... First, we present experiments showing that the classic margin bounds alone fail to explain the performance of state-of-the-art gradient boosting algorithms. ... Finally, we run experiments demonstrating that our refined generalization bounds in fact succeed in explaining and predicting the performance of boosting algorithms. |
| Researcher Affiliation | Academia | All authors contributed equally, and are presented in alphabetical order. Department of Computer Science, Aarhus University, {jallan,lior.kamma,larsen}@cs.au.dk |
| Pseudocode | No | The paper focuses on theoretical proofs and experimental results, but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statements about releasing open-source code for the methodology described, nor does it provide a link to such code. |
| Open Datasets | Yes | Table 2: Data sets, all freely available, and parameters considered in the experiments. ... Diabetes, Boone, Forest Cover, Higgs |
| Dataset Splits | Yes | For each experiment we randomly split the data set in half to get a training set and a test set of equal size. For the Higgs dataset of size 11 million, we sample a subset of 2 million data points that we randomly split evenly into train and test set. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like LightGBM and XGBoost as part of the boosting algorithms studied, but it does not specify version numbers for these or any other ancillary software dependencies (e.g., programming languages, libraries) used in their experimental setup. |
| Experiment Setup | Yes | For all experiments we only change the tree size and learning rate of the LightGBM hyperparameters. For AdaBoost we allow the same tree size, unlimited depth, as well as forcing a minimum number of elements in each tree learner to be 20, as is default in LightGBM. ... Table 2: Data Set, Tree Size, LR, Runs (A hedged configuration sketch follows the table.) |
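
As noted in the Dataset Splits row, each dataset is split in half at random, with Higgs first subsampled to 2 million points. The sketch below is one way such a protocol could be reproduced; the use of NumPy/scikit-learn, the random seed, and the synthetic stand-in data are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)  # seed chosen for illustration only

def half_split(X, y, subsample=None):
    """Randomly split a dataset into train and test halves of equal size.

    If subsample is given (e.g. 2_000_000 for Higgs), first draw that many
    points uniformly at random before splitting, as the paper describes.
    """
    if subsample is not None and subsample < len(X):
        idx = rng.choice(len(X), size=subsample, replace=False)
        X, y = X[idx], y[idx]
    return train_test_split(X, y, test_size=0.5, random_state=rng)

# Synthetic stand-in data; actual experiments would load one of the listed datasets.
X = rng.randn(10_000, 20)
y = rng.randint(0, 2, size=10_000)
X_train, X_test, y_train, y_test = half_split(X, y)
```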
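
Likewise, the Experiment Setup row states only that tree size and learning rate are varied for LightGBM, and that AdaBoost uses trees of the same size with unlimited depth and at least 20 samples per leaf. The sketch below shows how such a configuration might look; the concrete values (31 leaves, learning rate 0.1, 100 boosting rounds) are placeholders rather than the grid from the paper's Table 2, and the `estimator` argument assumes scikit-learn 1.2 or newer (older releases call it `base_estimator`).

```python
import lightgbm as lgb
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

tree_size = 31       # leaves per tree; placeholder, not a value from Table 2
learning_rate = 0.1  # placeholder learning rate

# LightGBM booster: only tree size and learning rate deviate from defaults.
gbm = lgb.LGBMClassifier(
    num_leaves=tree_size,
    learning_rate=learning_rate,
    n_estimators=100,  # number of boosting rounds; placeholder
)

# AdaBoost with tree learners of the same size, unlimited depth, and a minimum
# of 20 samples per leaf (mirroring LightGBM's min_data_in_leaf default).
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(
        max_leaf_nodes=tree_size,
        max_depth=None,
        min_samples_leaf=20,
    ),
    n_estimators=100,
)

# Fit on a split produced as in the sketch above, e.g.:
# gbm.fit(X_train, y_train); ada.fit(X_train, y_train)
```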