Gradient Boosting with Piece-Wise Linear Regression Trees
Authors: Yu Shi, Jian Li, Zhize Li
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that GBDT with PL Trees can provide very competitive testing accuracy with comparable or less training time. |
| Researcher Affiliation | Academia | Yu Shi, Jian Li and Zhize Li, Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Training Process of PL Tree (see the leaf-fit sketch below the table) |
| Open Source Code | Yes | Our code, details of the experiment settings, and datasets are available at the GitHub page: https://github.com/GBDT-PL/GBDT-PL.git |
| Open Datasets | Yes | Our code, details of the experiment settings, and datasets are available at the GitHub page: https://github.com/GBDT-PL/GBDT-PL.git |
| Dataset Splits | Yes | For GBDT-PL, we separate 20% of the training data for validation, pick the best setting on the validation set, and then record the corresponding accuracy on the test set. |
| Hardware Specification | No | The paper mentions "modern computer architectures with powerful Single Instruction Multiple Data (SIMD) parallelism" and "Training Time on CPU" but does not specify exact CPU or GPU models, or other hardware components used for experiments. |
| Software Dependencies | No | The paper mentions "Intel MKL [Wang et al., 2014]" but does not provide a specific version number. No other software dependencies with version numbers are listed. |
| Experiment Setup | Yes | Key hyperparameters we tuned include: (1) num leaves {16, 64, 256, 1024}, which controls the size of each tree; for CatBoost in Symmetric Tree mode the tree is grown level by level, so max depth {4, 6, 8, 10} is used instead of num leaves. (2) max bin {63, 255}, the maximum number of bins in the histograms. (3) min sum hessians {1.0, 100.0}, the minimum sum of hessians of the data in each leaf. (4) learning rate {0.01, 0.05, 0.1}, the weight of each tree. (5) l2 reg {0.01, 10.0}, the l2 regularization on leaf predicted values. We fix the number of regressors used in GBDT-PL to 5 in all runs. (See the tuning sketch below the table.) |
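For context on the Pseudocode row: the paper's Algorithm 1 grows a tree whose leaves carry linear models fitted against the boosting gradients and hessians. Below is a minimal sketch of that leaf fit under the standard second-order (Newton) formulation; the function name and the use of all supplied feature columns are our assumptions, not the paper's code (the paper restricts each leaf to a small fixed set of regressors, 5 in its experiments).

```python
import numpy as np

def fit_pl_leaf(X, grad, hess, l2_reg=0.01):
    """Sketch of fitting the linear model of one PL-tree leaf.

    Minimizes the second-order approximation of the boosting loss,
        sum_i [ g_i * f(x_i) + 0.5 * h_i * f(x_i)^2 ] + 0.5 * l2 * ||w||^2
    with f(x) = w @ x + b, the usual closed form for Newton-style
    boosting with linear leaf models (assumed, not the paper's exact code).
    """
    # Append a constant column so the bias is learned jointly with w.
    # Note this sketch regularizes the bias as well, for simplicity.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    # Weighted normal equations: (Xb^T diag(h) Xb + l2*I) w = -Xb^T g
    XtH = Xb.T * hess  # row-wise scaling, equivalent to Xb.T @ diag(hess)
    A = XtH @ Xb + l2_reg * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, -Xb.T @ grad)
    return w[:-1], w[-1]  # (weights, bias)
```

At prediction time such a leaf would contribute learning_rate * (x @ w + b) to the ensemble output, matching the usual additive boosting update.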
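For the Dataset Splits and Experiment Setup rows: the sketch below wires the quoted grid and the 20% validation hold-out into a tuning loop. `train_and_eval` is a hypothetical placeholder for whichever library is being tuned (GBDT-PL, LightGBM, XGBoost or CatBoost); the paper does not publish this exact harness.

```python
import itertools
from sklearn.model_selection import train_test_split

# Grid taken from the paper's reported setup.
GRID = {
    "num_leaves": [16, 64, 256, 1024],
    "max_bin": [63, 255],
    "min_sum_hessians": [1.0, 100.0],
    "learning_rate": [0.01, 0.05, 0.1],
    "l2_reg": [0.01, 10.0],
}

def tune(X, y, train_and_eval):
    # Hold out 20% of the training data for validation, as in the paper.
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    best_score, best_params = float("-inf"), None
    keys = list(GRID)
    for values in itertools.product(*(GRID[k] for k in keys)):
        params = dict(zip(keys, values))
        # train_and_eval is assumed to return validation accuracy.
        score = train_and_eval(params, X_tr, y_tr, X_val, y_val)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

The best setting found on the validation split is then the one whose test-set accuracy gets reported, as the Dataset Splits quote describes.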