Regularized Gradient Boosting

Authors: Corinna Cortes, Mehryar Mohri, Dmitry Storcheus

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we provide experimental results, demonstrating that our algorithm achieves significantly better out-of-sample performance on multiple datasets than the standard GB algorithm used with its regularization.
Researcher Affiliation | Collaboration | Corinna Cortes, Google Research, New York, NY 10011, corinna@google.com; Mehryar Mohri, Google & Courant Institute, New York, NY 10012, mohri@google.com; Dmitry Storcheus, Courant Institute & Google, New York, NY 10012, dstorcheus@google.com
Pseudocode | Yes | Algorithm 1 (RGB). Input: α = 0, F = 0
1: for t ∈ [1, T] do
2:   [t_1, ..., t_S] ← P
3:   for s ∈ [1, S] do
4:     h_s ← argmin_{h ∈ H_{t_s}} (1/m) Σ_{i=1}^m Φ(y_i, F - (1/C_{t_s}) L'_{t_s}(α) h)
5:   end for
6:   s ← argmin_{s ∈ [1, S]} (1/m) Σ_{i=1}^m Φ(y_i, F - (1/C_{t_s}) L'_{t_s}(α) h_s) + β Ω(h_{t_s})
7:   α ← α - (1/C_s) L'_s(α) e_{t_s}
8:   F ← F - (1/C_s) L'_s(α) h_s
9: end for
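The selection step in Algorithm 1 (fit one candidate per base-learner subfamily, then keep the candidate that minimizes the empirical loss of the updated ensemble plus β times its complexity penalty) can be illustrated with a short sketch. This is not the authors' implementation: the tree-based subfamilies, the normalized leaf-count stand-in for Ω(h), the fixed step size eta replacing the 1/C_s smoothness-based steps, and the rgb_fit/rgb_predict helper names are all simplifying assumptions made for illustration.

# Minimal sketch of an RGB-style selection loop under the assumptions stated above.
# Labels y are assumed to be in {-1, +1}; Φ is the logistic loss, as in the paper's setup.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def logistic_loss(y, f):
    # Empirical logistic loss of ensemble scores f against labels y in {-1, +1}.
    return np.mean(np.log1p(np.exp(-y * f)))

def rgb_fit(X, y, leaf_grid=(2, 4, 8, 16), beta=0.1, eta=0.1, T=100):
    m = len(y)
    F = np.zeros(m)                 # current ensemble scores on the training sample
    ensemble = []                   # list of (weight, base learner) pairs
    max_leaves = max(leaf_grid)
    for _ in range(T):
        # Negative functional gradient of the logistic loss with respect to F.
        residual = y / (1.0 + np.exp(y * F))
        best = None
        for n_leaves in leaf_grid:  # one candidate per subfamily (here: leaf budget)
            h = DecisionTreeRegressor(max_leaf_nodes=n_leaves)
            h.fit(X, residual)
            step = eta * h.predict(X)
            omega = n_leaves / max_leaves  # placeholder complexity penalty in [0, 1]
            score = logistic_loss(y, F + step) + beta * omega
            if best is None or score < best[0]:
                best = (score, h, step)
        _, h, step = best
        ensemble.append((eta, h))
        F = F + step
    return ensemble

def rgb_predict(ensemble, X):
    # Sum of weighted base-learner predictions; sign gives the predicted class.
    return sum(w * h.predict(X) for w, h in ensemble)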
Open Source Code | No | No explicit statement or link providing access to the open-source code for the methodology described in this paper was found.
Open Datasets | Yes | Table 1 shows the classification errors on the test sets for the UCI datasets studied, for both RGB and GB; see Table 2 in the appendix for details on the datasets.
Table 2: Dataset Statistics.
Dataset | #Features | #Train | #Test
sonar [UCI] | 60 | 104 | 104
cancer [UCI] | 9 | 342 | 227
diabetes [UCI] | 8 | 468 | 300
ocr17 [LIBSVM] | 256 | 1686 | 422
ocr49 [LIBSVM] | 256 | 1686 | 422
mnist17 [LIBSVM] | 780 | 12665 | 3167
mnist49 [LIBSVM] | 780 | 12665 | 3167
higgs [UCI] | 28 | 88168 | 22042
Dataset Splits | Yes | The hyperparameters are chosen via 5-fold cross-validation, and the standard errors for the best set of hyperparameters are reported.
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, or cloud instance types) used for running the experiments were provided.
Software Dependencies | No | The paper mentions using "the XGBOOST library" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For a given training sample, we normalize the regularization Ω(h) to be in [0, 1] and tune the RGB parameter β using a grid search over β ∈ {0.001, 0.01, 0.1, 0.3, 1}. We use the logistic loss as the per-instance loss Φ. For the complexity of these base classifiers we use the bound derived in Theorem 1. To define the subfamilies of base learners we impose a grid of size 7 on the maximum number of internal nodes, n ∈ {2, 4, 8, 16, 32, 64, 256}, and a grid of size 7 on λ ∈ {0.001, 0.01, 0.1, 0.5, 1, 2, 4}. Both GB and RGB are run for T = 100 boosting rounds. The hyperparameters are chosen via 5-fold cross-validation, and the standard errors for the best set of hyperparameters are reported. Specifically, we let the ℓ2-norm regularization parameter be in {0.001, 0.01, 0.1, 0.5, 1, 2, 4}, the maximum tree depth parameter in {1, 2, 3, 4, 5, 6, 7}, and the learning rate parameter in {0.001, 0.01, 0.1, 0.5, 1}.
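As a concrete illustration of the GB baseline tuning described above, the following sketch sets up the quoted grids with the xgboost scikit-learn wrapper and 5-fold cross-validation. The paper names the XGBOOST library but not this exact API, version, or parameter mapping, so XGBClassifier, GridSearchCV, and the X_train/y_train placeholders are assumptions for illustration.

# Hedged sketch of the baseline GB grid search, assuming xgboost's sklearn wrapper.
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Grids quoted in the setup above: l2 regularization, maximum tree depth, learning rate.
param_grid = {
    "reg_lambda": [0.001, 0.01, 0.1, 0.5, 1, 2, 4],
    "max_depth": [1, 2, 3, 4, 5, 6, 7],
    "learning_rate": [0.001, 0.01, 0.1, 0.5, 1],
}

# T = 100 boosting rounds with the logistic loss.
gb = XGBClassifier(n_estimators=100, objective="binary:logistic")
search = GridSearchCV(gb, param_grid, cv=5)  # 5-fold cross-validation
# search.fit(X_train, y_train)  # X_train, y_train: placeholders for one of the datasets in Table 2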