Sample-Efficient Agnostic Boosting

Authors: Udaya Ghai, Karan Singh

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In preliminary experiments in Section 7, we demonstrate that the sample complexity improvements of our approach do indeed manifest in the form of improved empirical performance over previous agnostic boosting approaches on commonly used datasets."
Researcher Affiliation | Collaboration | Udaya Ghai, Amazon (ughai@amazon.com); Karan Singh, Tepper School of Business, Carnegie Mellon University (karansingh@cmu.edu)
Pseudocode | Yes | Algorithm 1: Agnostic Boosting via Sample Reuse
Open Source Code | No | The paper describes implementation details in Appendix J but provides neither a link to open-source code nor an explicit statement of code availability.
Open Datasets | Yes | "In Table 3, we report the results of preliminary experiments with Algorithm 1 against the agnostic boosting algorithms in [KK09] and [BCHM20] as baselines on UCI classification datasets [SWHB89, HRFS99, SED+88], using decision stumps [PVG+11] as weak learners." (A sketch of a decision-stump weak learner appears below the table.)
Dataset Splits | Yes | "Accuracy is estimated using 30-fold cross-validation with a grid search over the mixing weight σ and the number of boosters T." (See the cross-validation sketch below the table.)
Hardware Specification | Yes | "The runtime used for all experiments was a Google Cloud Engine VM instance with 2 vCPUs (Intel Xeon 64-bit @ 2.20 GHz) and 12 GB RAM."
Software Dependencies | No | The paper mentions decision stumps [PVG+11], whose citation is "Scikit-learn: Machine learning in Python", but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | "For all algorithms, we perform a grid search on the number of boosting rounds with T ∈ {25, 50, 100}. For Algorithm 1 we search over σ ∈ {0.1, 0.25, 0.5} as well. Rather than using a fixed η, our implementation uses an adaptive step-size scheme proportional to the empirical correlation on the current relabeled distribution." (See the adaptive step-size sketch below the table.)
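
The Open Datasets row quotes decision stumps [PVG+11] as the weak learners on UCI classification datasets. As a minimal sketch, assuming scikit-learn (the library the [PVG+11] citation refers to), a decision stump is a depth-1 decision tree; load_breast_cancer is used here only as a stand-in UCI-style dataset, since the specific datasets behind [SWHB89, HRFS99, SED+88] are not identified in this report.

```python
# Minimal sketch: a decision stump (depth-1 tree) as a weak learner.
# load_breast_cancer is a stand-in for the paper's UCI datasets (assumption).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

stump = DecisionTreeClassifier(max_depth=1)  # a decision stump

# Weak-learner accuracy on its own, before any boosting.
print(cross_val_score(stump, X, y, cv=5).mean())
```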
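
The Dataset Splits and Experiment Setup rows describe the evaluation protocol: accuracy estimated by 30-fold cross-validation with a grid search over the number of boosting rounds T (and, for Algorithm 1, the mixing weight σ). A minimal sketch of that protocol follows; scikit-learn's AdaBoostClassifier is a stand-in for Algorithm 1, whose code is not released, so the σ axis of the grid has no counterpart here.

```python
# Minimal sketch of the evaluation protocol: 30-fold CV with a grid search
# over T in {25, 50, 100}. AdaBoostClassifier is a stand-in for Algorithm 1
# (assumption); its default base estimator is already a decision stump.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

grid = GridSearchCV(
    AdaBoostClassifier(),
    param_grid={"n_estimators": [25, 50, 100]},  # T in {25, 50, 100}
    cv=30,                 # 30-fold cross-validation, as quoted above
    scoring="accuracy",
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```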
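
The Experiment Setup row also quotes an adaptive step size proportional to the empirical correlation on the current relabeled distribution. The sketch below illustrates such a rule inside a generic boosting loop; it is not the paper's Algorithm 1: reweighting stands in for relabeling, nothing models the sample-reuse scheme, and the constant c and the exponential weight update are assumptions for illustration.

```python
# Generic boosting loop with an adaptive step size eta_t proportional to the
# weak learner's empirical correlation on the current distribution.
# Reweighting stands in for the paper's relabeling (assumption); the constant
# c and the exponential update are illustrative, not from the paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, T=50, c=1.0):
    """Labels y must be in {-1, +1}; returns a list of (eta_t, stump_t)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                # current distribution over examples
    ensemble = []
    for _ in range(T):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        h = stump.predict(X)               # weak hypothesis in {-1, +1}
        corr = float(np.sum(w * h * y))    # empirical correlation under w
        eta = c * corr                     # adaptive step size, not a fixed eta
        w *= np.exp(-eta * y * h)          # downweight correctly voted points
        w /= w.sum()
        ensemble.append((eta, stump))
    return ensemble

def predict(ensemble, X):
    """Sign of the eta-weighted vote of the stumps."""
    return np.sign(sum(eta * stump.predict(X) for eta, stump in ensemble))
```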