Quantized Training of Gradient Boosting Decision Trees
Authors: Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Surprisingly, both our theoretical analysis and empirical studies show that the necessary precisions of gradients without hurting any performance can be quite low, e.g., 2 or 3 bits. Benchmarked on CPUs, GPUs, and distributed clusters, we observe up to 2× speedup of our simple quantization strategy compared with SOTA GBDT systems on extensive datasets, demonstrating the effectiveness and potential of the low-precision training of GBDT. |
| Researcher Affiliation | Collaboration | (1) Microsoft Research; (2) DP Technology; (3) Tsinghua University |
| Pseudocode | Yes | Algorithm 1: Histogram Construction for Leaf s |
| Open Source Code | No | The code will be released to the official repository of LightGBM. |
| Open Datasets | Yes | Table 1 ("Datasets used in experiments", with columns Name, #Train, #Test, #Attribute, Task, Metric) lists the datasets. Footnotes link to the sources: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#epsilon, https://go.criteo.net/criteo-research-kaggle-display-advertising-challenge-dataset.tar.gz, https://www.kaggle.com/c/bosch-production-line-performance, and https://webscope.sandbox.yahoo.com/catalog.php?datatype=c. Other datasets are given by citation: Higgs [2], Kitsune [21], Year [3], LETOR [25]. |
| Dataset Splits | No | The paper mentions 'test set' but does not provide specific details about a validation set or how data was split into training, validation, and test subsets for reproducibility. |
| Hardware Specification | Yes | No-packing version on GPU requires atomic addition for 16-bit integers, which is not natively supported by NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions software like LightGBM, XGBoost, CatBoost, and CUDA, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | A full description of datasets and hyperparameter settings is provided in Appendix C. |
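
To make the idea behind the "2 or 3 bits" claim and the histogram-construction pseudocode more concrete, the following is a minimal sketch of low-bit gradient quantization followed by integer histogram accumulation. It is not the authors' implementation: the function names `quantize_gradients` and `build_histogram`, the symmetric max-abs scaling, and the use of stochastic rounding are assumptions chosen for illustration, and a single feature with precomputed bin indices is assumed.

```python
import numpy as np


def quantize_gradients(grad, num_bits=3, rng=None):
    """Map float gradients to low-bit signed integers (illustrative sketch).

    Assumption: symmetric scaling by the largest absolute gradient, with
    stochastic rounding so the quantized value is unbiased in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.asarray(grad, dtype=np.float64)

    # Representable integer range is [-max_level, max_level].
    max_level = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(grad))
    scale = max_abs / max_level if max_abs > 0 else 1.0

    scaled = grad / scale
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part.
    round_up = rng.random(grad.shape) < (scaled - lower)
    # Clip guards against floating-point overshoot at the range boundary.
    q = np.clip(lower + round_up, -max_level, max_level).astype(np.int8)
    return q, scale


def build_histogram(bin_index, q_grad, num_bins):
    """Sum quantized gradients per bin for one feature using integer adds."""
    hist = np.zeros(num_bins, dtype=np.int32)
    # np.add.at accumulates correctly when bin indices repeat.
    np.add.at(hist, bin_index, q_grad)
    return hist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.normal(size=8)
    bins = rng.integers(0, 4, size=8)
    q, scale = quantize_gradients(grad, num_bits=3, rng=rng)
    hist = build_histogram(bins, q, num_bins=4)
    # Multiplying by the scale recovers approximate float gradient sums.
    print(hist * scale)
```

The point of the sketch is that the histogram entries are accumulated with integer additions only, and the floating-point scale is applied once per bin afterwards.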