Optimal Minimal Margin Maximization with Boosting
Authors: Alexander Mathiasen, Kasper Green Larsen, Allan Grønlund
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate how our sparsification algorithm can be combined with Gradient Boosting. For simplicity we consider a single dataset in this section, the Flight Delay dataset (air); see the full version (Grønlund et al., 2019) for similar results on other datasets. We train a classifier with T = 500 hypotheses using LightGBM (Ke et al., 2017), which we sparsify using Theorem 2.2 to have T = 80 hypotheses. The sparsified classifier is guaranteed to preserve all margins of the original classifier to an additive O(√(lg(n/T)/T)). The cumulative margins of the sparsified classifier and the original classifier are depicted in Figure 1. Furthermore, we also depict the cumulative margins of a LightGBM classifier trained to have T = 80 hypotheses. ... To investigate this, we performed additional experiments computing AUC and classification accuracy of several sparsified classifiers and LightGBM classifiers on a test set. ... The experiments indeed show that the sparsified classifiers outperform the LightGBM classifiers with the same number of hypotheses. See Figure 2 for test AUC and test classification accuracy. |
| Researcher Affiliation | Academia | Allan Grønlund *1, Kasper Green Larsen *1, Alexander Mathiasen *1. 1 Department of Computer Science, University of Aarhus, Denmark. |
| Pseudocode | Yes | Algorithm 1 SparsiBoost. Input: training data D = {(x_i, y_i)}_{i=1}^n where x_i ∈ X for some input space X and y_i ∈ {−1, +1}; target number of hypotheses T; and base learning algorithm A. Output: hypotheses h_1, ..., h_k and weights w_1, ..., w_k with k ≤ T, such that Σ_i w_i h_i has gap O(√(lg(2 + n/T)/T)) on D. 1. Run AdaBoostV with base learning algorithm A on training data D to get c·T hypotheses h_1, ..., h_{c·T} and weights w_1, ..., w_{c·T}, for the integer c = ⌈lg(n)/lg(2 + n/T)⌉. 2. Construct the margin matrix U ∈ [−1, +1]^{n×c·T} where u_{ij} = y_i h_j(x_i). 3. Form the vector w with i-th coordinate w_i and normalize w ← w/‖w‖_1 so that ‖w‖_1 = 1. 4. Find w̃ such that ‖w̃‖_0 ≤ T, ‖w̃‖_1 = 1 and ‖Uw − Uw̃‖_∞ = O(√(lg(2 + n/T)/T)). 5. Let π(j) denote the index of the j-th non-zero entry of w̃. 6. Return hypotheses h_{π(1)}, ..., h_{π(‖w̃‖_0)} with weights w̃_{π(1)}, ..., w̃_{π(‖w̃‖_0)}. (A hedged Python sketch of this pipeline follows the table.) |
| Open Source Code | Yes | A Python implementation of Theorem 2.2 can be found at: https://github.com/AlgoAU/DiscMin |
| Open Datasets | Yes | For simplicity we consider a single dataset in this section, the Flight Delay dataset (air)... Flight delay data. https://github.com/szilard/benchm-ml#data. ... Inspired by the experiments in (Wang et al., 2008; Ke et al., 2017; Chen & Guestrin, 2016) we also performed the above experiments on the Higgs (Whiteson, 2014) and Letter (Dheeru & Karra Taniskidou, 2017) datasets. |
| Dataset Splits | No | The paper mentions using a 'test set' for evaluation but does not specify the dataset split percentages, sample counts, or whether a validation set was used, nor does it cite a predefined split for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper mentions 'Light GBM' and 'Python implementation' but does not specify exact version numbers for these or any other software dependencies required for reproducibility. |
| Experiment Setup | Yes | We train a classifier with T = 500 hypotheses using LightGBM (Ke et al., 2017), which we sparsify using Theorem 2.2 to have T = 80 hypotheses. ... The plot depicts test AUC and test classification accuracy of a LightGBM classifier during training as the number of hypotheses increases (in blue). (A hedged sketch of this train-then-sparsify workflow follows the table, after the Algorithm 1 sketch.) |
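
The following is a minimal sketch of the data flow in Algorithm 1 (margin matrix, ℓ1-normalization, sparsification, selection of surviving hypotheses). The `naive_sparsify` step below simply keeps the T largest-magnitude weights and renormalizes; it is a stand-in for the discrepancy-minimization routine in the authors' DiscMin repository and does not achieve the stated O(√(lg(2 + n/T)/T)) gap guarantee. All function names are illustrative, not from the paper.

```python
import numpy as np

def margin_matrix(hypotheses, X, y):
    """Step 2: margin matrix U with u_ij = y_i * h_j(x_i)."""
    H = np.column_stack([h(X) for h in hypotheses])   # each h maps samples to [-1, +1]
    return np.asarray(y, dtype=float)[:, None] * H

def naive_sparsify(w, T):
    """Stand-in for step 4: keep the T largest-magnitude weights and renormalize.

    The paper instead chooses w~ via discrepancy minimization so that
    ||Uw - Uw~||_inf = O(sqrt(lg(2 + n/T)/T)); this greedy truncation carries
    no such guarantee and only makes the data flow concrete.
    """
    keep = np.argsort(np.abs(w))[-T:]                 # indices of the T largest |w_j|
    w_sparse = np.zeros_like(w)
    w_sparse[keep] = w[keep]
    return w_sparse / np.abs(w_sparse).sum()          # re-normalize so ||w~||_1 = 1

def sparsiboost_like(hypotheses, w, X, y, T):
    """Steps 2-6 of Algorithm 1, using the naive sparsifier above."""
    w = np.asarray(w, dtype=float)
    w = w / np.abs(w).sum()                           # step 3: normalize so ||w||_1 = 1
    U = margin_matrix(hypotheses, X, y)               # step 2
    w_sparse = naive_sparsify(w, T)                   # step 4 (stand-in)
    kept = np.flatnonzero(w_sparse)                   # step 5: surviving hypothesis indices
    gap = np.abs(U @ w - U @ w_sparse).max()          # achieved ||Uw - Uw~||_inf
    return [hypotheses[j] for j in kept], w_sparse[kept], gap
```

Here `hypotheses` is assumed to be a list of callables returning predictions in [−1, +1]; step 1 (running AdaBoostV to obtain the c·T hypotheses) is omitted.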
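Below is a hedged sketch of the reported experimental setup: train a LightGBM classifier with T = 500 trees, build the margin matrix from per-tree raw scores, sparsify to T = 80 trees, and compare test AUC. It assumes LightGBM ≥ 3.0 (for `start_iteration` in `Booster.predict`); `load_flight_delay` and `sparsify` are hypothetical placeholders for a benchm-ml data loader and the Theorem 2.2 routine from the DiscMin repository. Note that raw per-tree scores are not confined to [−1, +1] as the theory assumes, so this is illustrative only.

```python
import numpy as np
import lightgbm as lgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder loader for the Flight Delay data
# (https://github.com/szilard/benchm-ml#data); returns features X and labels y in {-1, +1}.
X, y = load_flight_delay()                                # hypothetical helper, not shown

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Full classifier with T = 500 trees, mirroring the experiment described above.
booster = lgb.train({"objective": "binary", "verbosity": -1},
                    lgb.Dataset(X_tr, label=(y_tr + 1) // 2),
                    num_boost_round=500)

def per_tree_scores(booster, X):
    """Raw score h_j(x) of each individual tree (requires LightGBM >= 3.0)."""
    return np.column_stack([
        booster.predict(X, start_iteration=j, num_iteration=1, raw_score=True)
        for j in range(booster.num_trees())
    ])

U = y_tr[:, None] * per_tree_scores(booster, X_tr)        # margin matrix u_ij = y_i h_j(x_i)

# `sparsify` stands for the Theorem 2.2 routine (https://github.com/AlgoAU/DiscMin):
# it should return T-sparse, l1-normalized weights over the 500 trees.
w_sparse = sparsify(U, T=80)                              # hypothetical call

H_te = per_tree_scores(booster, X_te)
print("test AUC, full 500-tree model :", roc_auc_score(y_te, H_te.sum(axis=1)))
print("test AUC, sparsified 80 trees :", roc_auc_score(y_te, H_te @ w_sparse))
```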