NGBoost: Natural Gradient Boosting for Probabilistic Prediction
Authors: Tony Duan, Anand Avati, Daisy Yi Ding, Khanh K. Thai, Sanjay Basu, Andrew Y. Ng, Alejandro Schuler
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments use datasets from the UCI Machine Learning Repository, and follow the protocol first proposed in Hernández-Lobato and Adams (2015). For all datasets, we hold out a random 10% of the examples as a test set. From the other 90% we initially hold out 20% as a validation set to select M (the number of boosting stages) that gives the best log-likelihood, and then retrain on the entire 90% using the chosen M. The retrained model is then made to predict on the held-out 10% test set. This entire process is repeated 20 times for all datasets except Protein and Year MSD, for which it is repeated 5 times and 1 time respectively. |
| Researcher Affiliation | Collaboration | (1) Stanford University, Stanford, California, United States; (2) Unlearn.ai, San Francisco, California, United States; (3) Harvard Medical School, Cambridge, Massachusetts, United States. |
| Pseudocode | Yes | Algorithm 1 NGBoost for probabilistic prediction |
| Open Source Code | Yes | An open-source implementation is available at github.com/stanfordmlgroup/ngboost. |
| Open Datasets | Yes | Our experiments use datasets from the UCI Machine Learning Repository, and follow the protocol first proposed in Hernández-Lobato and Adams (2015). |
| Dataset Splits | Yes | For all datasets, we hold out a random 10% of the examples as a test set. From the other 90% we initially hold out 20% as a validation set to select M (the number of boosting stages) that gives the best log-likelihood, and then retrain on the entire 90% using the chosen M. |
| Hardware Specification | No | The paper discusses computational aspects like mini-batching and scalability to large datasets, but it does not provide specific details on the hardware used, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions using 'Scikit-Learn implementation' for comparison methods but does not provide specific version numbers for Scikit-Learn or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, NGBoost was configured with the Normal distribution, decision tree base learner with a maximum depth of three levels, and log scoring rule. The Year MSD dataset, being extremely large relative to the rest, was fit using a learning rate η of 0.1 while the rest of the datasets were fit with a learning rate of 0.01. In general we recommend small learning rates, subject to computational feasibility. For the Year MSD dataset we use a mini-batch size of 10%, for all other datasets we use 100%. |
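
The split protocol and hyperparameters quoted in the table above can be pieced together into a single evaluation routine. The sketch below is one possible reconstruction, not the authors' script: it assumes the open-source `ngboost` package (github.com/stanfordmlgroup/ngboost, installable via `pip install ngboost`) together with scikit-learn, and the class and argument names (`NGBRegressor`, `Dist`, `Score`, `Base`, `minibatch_frac`, `best_val_loss_itr`) follow a recent version of that package and may differ in older releases.

```python
# Sketch of one trial of the UCI evaluation protocol with the paper's
# reported NGBoost configuration (Normal distribution, depth-3 trees,
# log scoring rule). Hyperparameter defaults here are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

from ngboost import NGBRegressor
from ngboost.distns import Normal
from ngboost.scores import LogScore


def run_one_trial(X, y, seed=0, learning_rate=0.01, minibatch_frac=1.0, max_stages=2000):
    # Hold out a random 10% of the examples as the test set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.10, random_state=seed
    )
    # From the remaining 90%, hold out 20% as a validation set used to pick M.
    X_fit, X_val, y_fit, y_val = train_test_split(
        X_train, y_train, test_size=0.20, random_state=seed
    )

    def make_model(n_estimators):
        return NGBRegressor(
            Dist=Normal,                    # Normal predictive distribution
            Score=LogScore,                 # log (MLE) scoring rule
            Base=DecisionTreeRegressor(criterion="friedman_mse", max_depth=3),
            n_estimators=n_estimators,
            learning_rate=learning_rate,    # 0.01 for most datasets, 0.1 for Year MSD
            minibatch_frac=minibatch_frac,  # 1.0 for most datasets, 0.1 for Year MSD
        )

    # Fit with a validation set and take the number of boosting stages M
    # with the best validation log-likelihood (attribute name assumed).
    model = make_model(max_stages)
    model.fit(X_fit, y_fit, X_val=X_val, Y_val=y_val)
    best_M = model.best_val_loss_itr + 1

    # Retrain on the entire 90% with the chosen M, then score the 10% test set.
    final_model = make_model(best_M)
    final_model.fit(X_train, y_train)
    test_nll = -final_model.pred_dist(X_test).logpdf(y_test).mean()
    test_rmse = np.sqrt(np.mean((final_model.predict(X_test) - y_test) ** 2))
    return test_nll, test_rmse
```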
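
A hypothetical driver loop (the dataset keys and `run_one_trial` come from the sketch above, not from the paper's code) would then repeat the trial the number of times reported in the protocol: 20 random splits for most datasets, 5 for Protein, and 1 for Year MSD.

```python
# Repeat the trial with a different random split each time, per the paper's protocol.
n_trials = {"protein": 5, "msd": 1}.get(dataset_name, 20)
results = [run_one_trial(X, y, seed=s) for s in range(n_trials)]
```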