reproducibilityindex.ai

Quantized Training of Gradient Boosting Decision Trees

Authors: Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Surprisingly, both our theoretical analysis and empirical studies show that the necessary precisions of gradients without hurting any performance can be quite low, e.g., 2 or 3 bits. Benchmarked on CPUs, GPUs, and distributed clusters, we observe up to 2× speedup of our simple quantization strategy compared with SOTA GBDT systems on extensive datasets, demonstrating the effectiveness and potential of the low-precision training of GBDT.
Researcher Affiliation	Collaboration	(1) Microsoft Research; (2) DP Technology; (3) Tsinghua University
Pseudocode	Yes	Algorithm 1 Histogram Construction for Leaf s
Open Source Code	No	The code will be released to the official repository of LightGBM (footnote 4).
Open Datasets	Yes	Table 1: Datasets used in experiments (columns: Name, #Train, #Test, #Attribute, Task, Metric), with footnoted sources (5) https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#epsilon, (6) https://go.criteo.net/criteo-research-kaggle-display-advertising-challenge-dataset.tar.gz, (7) https://www.kaggle.com/c/bosch-production-line-performance, (8) https://webscope.sandbox.yahoo.com/catalog.php?datatype=c, along with citations such as Higgs [2], Kitsune [21], Year [3], LETOR [25].
Dataset Splits	No	The paper mentions 'test set' but does not provide specific details about a validation set or how data was split into training, validation, and test subsets for reproducibility.
Hardware Specification	Yes	No-packing version on GPU requires atomic addition for 16-bit integers, which is not natively supported by NVIDIA V100 GPUs.
Software Dependencies	No	The paper mentions software such as LightGBM, XGBoost, CatBoost, and CUDA, but does not provide version numbers for these or any other software dependencies.
Experiment Setup	Yes	A full description of datasets and hyperparameter settings is provided in Appendix C.
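
For readers unfamiliar with the idea referenced in the Research Type and Pseudocode rows, the following is a minimal Python sketch of low-bit gradient quantization followed by integer histogram accumulation. It is not the authors' implementation and is not taken from Algorithm 1: the stochastic rounding rule, the max-absolute-value scaling, and the names quantize_gradients and build_histogram are all assumptions chosen for illustration.

```python
import math
import random


def quantize_gradients(grads, bits, rng):
    """Map float gradients to small signed integers (illustrative only).

    Assumption: a symmetric grid whose largest level matches the largest
    absolute gradient, with stochastic rounding between neighbouring levels.
    """
    levels = 2 ** (bits - 1) - 1  # bits=3 gives integers in [-3, 3]
    max_abs = max(abs(g) for g in grads)
    scale = max_abs / levels if max_abs > 0 else 1.0
    quantized = []
    for g in grads:
        x = g / scale
        low = math.floor(x)
        # Round up with probability equal to the fractional part,
        # so the expected quantized value equals x.
        q = low + (1 if rng.random() < x - low else 0)
        # Clamp to guard against floating-point overshoot at the grid edges.
        quantized.append(max(-levels, min(levels, q)))
    return quantized, scale


def build_histogram(bin_indices, quantized, num_bins):
    """Sum integer gradients per feature bin using integer arithmetic only."""
    hist = [0] * num_bins
    for b, q in zip(bin_indices, quantized):
        hist[b] += q
    return hist


# Hypothetical toy data: five samples of one feature, bucketed into three bins.
rng = random.Random(0)
grads = [0.75, -0.25, 0.125, -0.75, 0.375]
bins = [0, 1, 1, 2, 0]

q, scale = quantize_gradients(grads, bits=3, rng=rng)
hist = build_histogram(bins, q, num_bins=3)

# Multiply by the scale to recover approximate float gradient sums per bin.
approx_sums = [h * scale for h in hist]
print(q, hist, approx_sums)
```

Under these assumptions the histogram entries stay small integers, which is the kind of property that makes narrow integer accumulation (such as the 16-bit atomic addition mentioned in the Hardware Specification row) relevant.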