Generalized Boosting

Authors: Arun Suggala, Bingbin Liu, Pradeep Ravikumar

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Using thorough empirical evaluation, we show that our learning algorithms have superior performance over traditional additive boosting algorithms, as well as existing greedy learning techniques for DNNs."
Researcher Affiliation | Academia | "Arun Sai Suggala, Bingbin Liu, Pradeep Ravikumar, Carnegie Mellon University, Pittsburgh, PA 15213, {asuggala,bingbinl,pradeepr}@cs.cmu.edu"
Pseudocode | Yes | "Algorithm 1 Generalized Boosting ... Algorithm 2 Exact Greedy Update ... Algorithm 3 Gradient Greedy Update" (an illustrative, hedged sketch of a greedy compositional boosting loop of this flavor is given after this table)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "In this section, we compare various techniques on the following image datasets: CIFAR10, MNIST, Fashion MNIST [35], MNIST-rot-back-image [24], convex [35], SVHN [28], and the following tabular datasets from UCI repository [7]: letter recognition [17], forest cover type (covtype), connect4."
Dataset Splits | Yes | "We used hold-out set validation to pick the best hyper-parameters for all the methods. We used 20% of the training data as validation data and picked the best parameters using grid search, based on validation accuracy."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions general software components and optimizers like "XGBoost", "Ada Boost", and "SGD", but does not specify any version numbers for these or other key software dependencies.
Experiment Setup | Yes | "We used hold-out set validation to pick the best hyper-parameters for all the methods. We used 20% of the training data as validation data and picked the best parameters using grid search, based on validation accuracy. After picking the best parameters, we train on the entire training data and report performance on the test data. For all the greedy techniques based on neural networks, we used fully connected blocks and tuned the following parameters: weight decay, width of weak feature transformers, number of boosting iterations T, which we upper bound by 15. For Cmplx Comp Boost, we set D0 = 5. For end-to-end training, we tuned weight decay, width of layers, depth. We used SGD for optimization of all these techniques. The number of epochs and step size schedule of SGD are chosen to ensure convergence. For XGBoost, we tuned the number of trees, depth of each tree, learning rate. The exact values of hyper-parameters tuned for each of the methods can be found in Appendix J." (a minimal sketch of this hold-out grid-search protocol is given after this table)
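
The Pseudocode row above names Algorithm 1 (Generalized Boosting) and its two greedy update subroutines but does not reproduce them. Purely for orientation, below is a hedged Python/PyTorch sketch of a greedy, stage-wise feature-composition boosting loop in the spirit those names suggest; the function name greedy_composition_boosting, the single fully connected block per stage, the linear head, and every default value are illustrative assumptions, not the paper's actual algorithm:

    # Illustrative sketch only (assumed names and update rule, NOT the paper's
    # Algorithm 1): greedily train a small "weak feature transformer" on top of
    # the features produced so far, freeze it, and repeat for T rounds.
    import torch
    import torch.nn as nn

    def greedy_composition_boosting(X, y, T=15, width=256, epochs=50, lr=0.1):
        num_classes = int(y.max().item()) + 1
        features = X                      # stage-0 representation: the raw input
        blocks, head = [], None
        for _ in range(T):
            # Weak feature transformer: one fully connected block (an assumption).
            block = nn.Sequential(nn.Linear(features.shape[1], width), nn.ReLU())
            head = nn.Linear(width, num_classes)      # linear classifier on top
            opt = torch.optim.SGD(list(block.parameters()) + list(head.parameters()), lr=lr)
            loss_fn = nn.CrossEntropyLoss()
            for _ in range(epochs):                   # fit this stage only
                opt.zero_grad()
                loss_fn(head(block(features)), y).backward()
                opt.step()
            with torch.no_grad():                     # freeze the new block and compose
                features = block(features)
            blocks.append(block)
        return blocks, head

At prediction time one would compose all stored blocks on a new input and apply the final linear head; the paper's exact update rules (and the distinction between the exact and gradient greedy variants) should be taken from Algorithms 1-3 themselves.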
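
The Experiment Setup row describes the tuning protocol: hold out 20% of the training data, grid-search hyperparameters by validation accuracy, then retrain on the full training set before reporting test performance. A minimal sketch of that protocol follows, assuming scikit-learn's train_test_split and a user-supplied train_and_eval callable; the callable and the example grid values are hypothetical placeholders, since the actual search space is listed in the paper's Appendix J:

    # Minimal hold-out grid-search sketch; `train_and_eval` and the grid values
    # are hypothetical placeholders, not the paper's actual search space.
    from itertools import product
    from sklearn.model_selection import train_test_split

    def holdout_grid_search(X, y, train_and_eval, grid):
        # Hold out 20% of the training data as a validation set.
        X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
        best_score, best_params = float("-inf"), None
        keys = list(grid)
        for values in product(*(grid[k] for k in keys)):
            params = dict(zip(keys, values))
            score = train_and_eval(X_tr, y_tr, X_val, y_val, **params)  # validation accuracy
            if score > best_score:
                best_score, best_params = score, params
        # Retrain on the full training data with best_params before reporting test accuracy.
        return best_params

    # Hypothetical grid, for illustration only:
    # grid = {"weight_decay": [1e-4, 1e-3], "width": [256, 512], "num_rounds_T": [5, 10, 15]}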