Regularized Gradient Boosting
Authors: Corinna Cortes, Mehryar Mohri, Dmitry Storcheus
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide experimental results, demonstrating that our algorithm achieves significantly better out-of-sample performance on multiple datasets than the standard GB algorithm used with its regularization. |
| Researcher Affiliation | Collaboration | Corinna Cortes Google Research New York, NY 10011 corinna@google.com Mehryar Mohri Google & Courant Institute New York, NY 10012 mohri@google.com Dmitry Storcheus Courant Institute & Google New York, NY 10012 dstorcheus@google.com |
| Pseudocode | Yes | Algorithm 1 RGB. Input: α = 0, F = 0. 1: for t ∈ [1, T] do 2: [t_1, …, t_S] ∼ P 3: for s ∈ [1, S] do 4: h_s ← argmin_{h ∈ H_{t_s}} (1/m) Σ_{i=1}^m Φ(y_i, F − (1/C_{t_s}) L′_{t_s}(α) h) 5: end for 6: s ← argmin_{s ∈ [1, S]} (1/m) Σ_{i=1}^m Φ(y_i, F − (1/C_{t_s}) L′_{t_s}(α) h_s) + β Ω(h_{t_s}) 7: α ← α − (1/C_s) L′_s(α) e_{t_s} 8: F ← F − (1/C_s) L′_s(α) h_s 9: end for (see the RGB sketch after the table) |
| Open Source Code | No | No explicit statement or link providing access to the open-source code for the methodology described in this paper was found. |
| Open Datasets | Yes | Table 1 shows the classification errors on the test sets for the UCI datasets studied, for both RGB and GB; see Table 2 in the appendix for details on the datasets. Table 2: Dataset Statistics (#features / #train / #test): sonar [UCI] 60 / 104 / 104; cancer [UCI] 9 / 342 / 227; diabetes [UCI] 8 / 468 / 300; ocr17 [LIBSVM] 256 / 1686 / 422; ocr49 [LIBSVM] 256 / 1686 / 422; mnist17 [LIBSVM] 780 / 12665 / 3167; mnist49 [LIBSVM] 780 / 12665 / 3167; higgs [UCI] 28 / 88168 / 22042. (A loading sketch follows the table.) |
| Dataset Splits | Yes | The hyperparameters are chosen via 5-fold cross-validation, and the standard errors for the best set of hyperparameters are reported. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, or cloud instance types) used for running the experiments were provided. |
| Software Dependencies | No | The paper mentions using "the XGBOOST library" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For a given training sample, we normalize the regularization Ω(h) to be in [0, 1] and tune the RGB parameter β using a grid search over β ∈ {0.001, 0.01, 0.1, 0.3, 1}. We use the logistic loss as the per-instance loss Φ. For the complexity of these base classifiers we use the bound derived in Theorem 1. To define the subfamilies of base learners we impose a grid of size 7 on the maximum number of internal nodes n ∈ {2, 4, 8, 16, 32, 64, 256} and a grid of size 7 on λ ∈ {0.001, 0.01, 0.1, 0.5, 1, 2, 4}. Both GB and RGB are run for T = 100 boosting rounds. The hyperparameters are chosen via 5-fold cross-validation, and the standard errors for the best set of hyperparameters are reported. Specifically, we let the ℓ2 norm regularization parameter be in {0.001, 0.01, 0.1, 0.5, 1, 2, 4}, the maximum tree depth parameter in {1, 2, 3, 4, 5, 6, 7}, and the learning rate parameter in {0.001, 0.01, 0.1, 0.5, 1}. (A grid-search sketch follows the table.) |
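The pseudocode row quotes Algorithm 1 (RGB): at every boosting round, several candidate subfamilies of base learners are drawn, a functional-gradient step is fitted within each, and the candidate that minimizes the empirical loss plus the complexity penalty βΩ(h) is added to the ensemble. The Python sketch below illustrates that selection loop under simplifying assumptions; the subfamily interface (`fit`, `predict`, `step_size`, `omega`) and the fixed step-size rule are hypothetical stand-ins for the paper's complexity-dependent step 1/C_k, not the authors' implementation.

```python
import numpy as np

def logistic_loss(y, f):
    """Per-instance logistic loss Phi(y, f) = log(1 + exp(-y * f))."""
    return np.log1p(np.exp(-y * f))

def rgb(X, y, subfamilies, T=100, S=3, beta=0.1, rng=None):
    """Sketch of the RGB selection loop (Algorithm 1).

    `subfamilies` is an assumed list of objects exposing
    .fit(X, residual) -> base learner with .predict(X),
    .step_size (stand-in for 1 / C_k) and .omega (normalized complexity in [0, 1]).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    F = np.zeros(len(y))       # current ensemble scores on the training sample
    ensemble = []              # list of (step, base learner) pairs
    for _ in range(T):
        # draw S candidate subfamilies for this round
        candidates = rng.choice(len(subfamilies), size=S, replace=False)
        # functional gradient of the logistic loss w.r.t. F
        grad = -y / (1.0 + np.exp(y * F))
        best = None
        for k in candidates:
            fam = subfamilies[k]
            h = fam.fit(X, -grad)                 # fit base learner to the negative gradient
            step = fam.step_size                  # plays the role of 1 / C_k
            F_try = F + step * h.predict(X)
            # penalized objective: empirical loss plus beta * Omega(h)
            obj = logistic_loss(y, F_try).mean() + beta * fam.omega
            if best is None or obj < best[0]:
                best = (obj, step, h)
        _, step, h = best
        F = F + step * h.predict(X)               # commit the winning step
        ensemble.append((step, h))
    return ensemble
```

The point of difference from plain GB is the selection in step 6 of the algorithm: candidate steps are compared on the penalized empirical loss, so more complex subfamilies must earn their extra capacity.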
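Several of the datasets in the statistics row (ocr17, ocr49, mnist17, mnist49) are listed as coming from LIBSVM, whose files are plain-text sparse matrices. The sketch below shows one way to load such a file and split it with scikit-learn; the file path, split fraction, and seed are illustrative assumptions, since the paper does not describe its preprocessing.

```python
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

def load_libsvm_dataset(path, test_size, seed=0):
    """Load a LIBSVM-format file and split it into train/test portions.
    The path and fixed seed are placeholders, not the paper's protocol."""
    X, y = load_svmlight_file(path)
    return train_test_split(X, y, test_size=test_size, random_state=seed)

# Example: approximate the mnist17 sizes from Table 2
# (12665 train / 3167 test, i.e. a test fraction of roughly 0.2).
# X_tr, X_te, y_tr, y_te = load_libsvm_dataset("mnist17.libsvm", test_size=0.2)
```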
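The experiment-setup row lists the grids used for the GB baseline with the XGBoost library (ℓ2 regularization, maximum tree depth, learning rate), tuned by 5-fold cross-validation with the logistic loss over T = 100 rounds. A minimal sketch of such a search is given below, assuming XGBoost's scikit-learn wrapper and scikit-learn's `GridSearchCV`; the scoring metric and data handling are assumptions rather than the paper's exact protocol.

```python
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Grids quoted for the GB baseline in the experiment-setup row.
param_grid = {
    "reg_lambda": [0.001, 0.01, 0.1, 0.5, 1, 2, 4],   # l2 regularization
    "max_depth": [1, 2, 3, 4, 5, 6, 7],               # maximum tree depth
    "learning_rate": [0.001, 0.01, 0.1, 0.5, 1],      # shrinkage
}

def tune_gb_baseline(X_train, y_train):
    """5-fold cross-validated grid search for the GB baseline.
    X_train and y_train stand in for one of the UCI/LIBSVM datasets."""
    base = XGBClassifier(
        n_estimators=100,                 # T = 100 boosting rounds
        objective="binary:logistic",      # logistic loss Phi
        eval_metric="logloss",
    )
    search = GridSearchCV(base, param_grid, cv=5, scoring="accuracy", n_jobs=-1)
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_
```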