Margins are Insufficient for Explaining Gradient Boosting

Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we first demonstrate that the kth margin bound is inadequate in explaining the performance of state-of-the-art gradient boosters. We then explain the shortcomings of the kth margin bound and prove a stronger and more refined margin-based generalization bound for boosted classifiers that indeed succeeds in explaining the performance of modern gradient boosters. Finally, we improve upon the recent generalization lower bound by Grønlund et al. (2019). ... First, we present experiments showing that the classic margin bounds alone fail to explain the performance of state-of-the-art gradient boosting algorithms. ... Finally, we run experiments demonstrating that our refined generalization bounds in fact succeed in explaining and predicting the performance of boosting algorithms. (A hedged sketch of the classic margin bound appears below the table.)
Researcher Affiliation | Academia | All authors contributed equally, and are presented in alphabetical order. Department of Computer Science, Aarhus University, {jallan,lior.kamma,larsen}@cs.au.dk
Pseudocode | No | The paper focuses on theoretical proofs and experimental results, but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statements about releasing open-source code for the methodology described, nor does it provide a link to such code.
Open Datasets | Yes | Table 2: Data sets, all freely available, and parameters considered in the experiments. ... Diabetes, Boone, Forest Cover, Higgs
Dataset Splits | Yes | For each experiment we randomly split the data set in half to get a training set and a test set of equal size. For the Higgs dataset of size 11 million, we sample a subset of 2 million data points that we randomly split evenly into train and test set.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions software such as LightGBM and XGBoost as part of the boosting algorithms studied, but it does not specify version numbers for these or any other ancillary software dependencies (e.g., programming languages, libraries) used in its experimental setup.
Experiment Setup | Yes | For all experiments we only change the tree size and learning rate of the LightGBM hyperparameters. For AdaBoost we allow the same tree size, unlimited depth, as well as forcing a minimum number of elements in each tree learner to be 20, as is default in LightGBM. ... Table 2: Data Set, Tree Size, LR, Runs (A hedged setup sketch appears below the table.)
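
The Research Type row refers to the kth margin bound and to refined margin-based generalization bounds. As background, below is a loose statement of the classical margin bound of Schapire, Freund, Bartlett, and Lee (1998), which the kth margin bounds and the paper's refined bound sharpen; constants and logarithmic factors are given only up to O-notation, and the paper's actual bounds differ.

```latex
% Loose form of the classical margin bound (Schapire, Freund, Bartlett, Lee, 1998).
% f is a voting classifier over a base hypothesis class H, the sample S has size n,
% theta > 0 is a margin threshold, and the bound holds with probability >= 1 - delta
% over the draw of S. Constants and exact log factors vary across statements.
\Pr_{\mathcal{D}}\!\left[ y f(x) \le 0 \right]
  \;\le\;
\Pr_{S}\!\left[ y f(x) \le \theta \right]
  \;+\;
O\!\left( \sqrt{ \frac{\ln n \, \ln |\mathcal{H}|}{\theta^{2} n}
                 \;+\; \frac{\ln (1/\delta)}{n} } \right).
```

The paper's point is that bounds of this general form, including the kth margin refinements, do not by themselves track the test error of tuned gradient boosters, which is what motivates the refined bound it proves.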
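
The Dataset Splits and Experiment Setup rows describe 50/50 random train/test splits and runs that vary only the tree size and learning rate of LightGBM, with AdaBoost using trees of the same size, unlimited depth, and at least 20 samples per leaf. Since the paper releases no code, the following is a minimal sketch under those assumptions; the library choices (lightgbm, scikit-learn), the helper run_experiment, and values such as n_estimators are illustrative, not the authors' implementation.

```python
# Hypothetical reconstruction of the setup described above: a 50/50 random split,
# LightGBM varying only tree size and learning rate, and AdaBoost over trees of
# the same size with >= 20 samples per leaf. Parameter values are assumptions.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from lightgbm import LGBMClassifier


def run_experiment(X, y, tree_size, learning_rate, n_estimators=1000, seed=0):
    # Split the data set in half to get train and test sets of equal size.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=seed)

    # LightGBM: only the tree size (num_leaves) and learning rate are varied;
    # min_child_samples=20 is the library default, stated here for clarity.
    lgbm = LGBMClassifier(
        num_leaves=tree_size,
        learning_rate=learning_rate,
        n_estimators=n_estimators,
        min_child_samples=20,
    )
    lgbm.fit(X_tr, y_tr)

    # AdaBoost: trees with the same number of leaves, unlimited depth, and at
    # least 20 samples per leaf, mirroring the LightGBM default.
    # Note: 'estimator' is named 'base_estimator' in scikit-learn < 1.2.
    ada = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(
            max_leaf_nodes=tree_size, min_samples_leaf=20),
        n_estimators=n_estimators,
    )
    ada.fit(X_tr, y_tr)

    return lgbm.score(X_te, y_te), ada.score(X_te, y_te)
```

For the Higgs data set, the paper first samples 2 million of the 11 million points and then applies the same even train/test split.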