Extrapolated Random Tree for Regression
Authors: Yuchao Cai, Yuheng Ma, Yiwei Dong, Hanfang Yang
ICML 2023 | Conference PDF | Archive PDF
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, we compare ERTR with state-of-the-art tree algorithms on real datasets to show the superior performance of our model. |
| Researcher Affiliation | Academia | ¹School of Statistics, Renmin University of China; ²Center for Applied Statistics, School of Statistics, Renmin University of China. Correspondence to: Hanfang Yang <hyang@ruc.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 Random Tree Partition; Algorithm 2 Extrapolated Random Tree for Regression |
| Open Source Code | Yes | All code is available on GitHub. [Footnote 1: https://github.com/Karlmyh/ERTR] |
| Open Datasets | Yes | ABA: The Abalone dataset originally comes from biological research (Nash et al., 1994) and now it is accessible on UCI Machine Learning Repository (Dua & Graff, 2017). AIR: The Airfoil Self-Noise dataset on UCI Machine Learning Repository... ALG: The Algerian Forest Fires dataset on UCI Machine Learning Repository... |
| Dataset Splits | Yes | For each pair of (p, L), we set λ = 10⁻⁴ as the regularized parameter for ridge regression and choose V ∈ {15, 20, 25} by cross-validation. ... We take 30% of the training data as the validation set. |
| Hardware Specification | Yes | All experiments are conducted on a machine with 72-core Intel Xeon 2.60GHz and 128GB main memory. |
| Software Dependencies | No | For standard decision trees, we use the implementation by Scikit-Learn (Pedregosa et al., 2011). We use the implementation in C++ [footnote 2]. ... We use the implementation in R [footnote 3]. ... We use the implementation in Python [footnote 4]. The paper names software such as Scikit-Learn, C++, R, and Python, but does not specify version numbers for these tools or for the specific libraries/packages beyond citing their original papers. |
| Experiment Setup | Yes | For ERTR, we use the parameter grids p ∈ {2, 3, 4, 5, 6, 7, 8}, C ∈ {0, 1} and λ ∈ {0.001, 0.01, 0.1}. V is fixed to be max(n/2^(p+2), 5). For each node, if the number of samples in the node is less than 5, then we stop the recursive partition process of the current node. For ERF, we set the number of trees to 200 and subsample {⌊0.5d⌋, ⌊0.75d⌋, d} features in each split procedure to look for the best cut. In addition, each base learner is trained on {⌊0.8n⌋, n, ⌊1.2n⌋} samples bootstrapped with replacement from D. For GBERTR, we set the number of trees to 100 and the learning rate to 0.01. |
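
The experiment-setup row above lists the hyperparameter search spaces for ERTR. A plain enumeration of those grids might look like the sketch below; only the grid values come from the paper, while the dictionary keys and the loop are illustrative naming choices, not identifiers from the released repository.

```python
from itertools import product

# Grid values as reported in the experiment-setup row; keys are illustrative names.
ertr_grid = {
    "p":   [2, 3, 4, 5, 6, 7, 8],
    "C":   [0, 1],
    "lam": [0.001, 0.01, 0.1],
}

# Enumerate every (p, C, lam) combination, e.g. to drive a validation search.
for p, C, lam in product(ertr_grid["p"], ertr_grid["C"], ertr_grid["lam"]):
    pass  # train and score one ERTR configuration for this setting
```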
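
The dataset-splits row describes a simple validation protocol: hold out 30% of the training data and pick V from {15, 20, 25} on the held-out set. The following minimal sketch illustrates that protocol; it is not the authors' released code, and DecisionTreeRegressor (with V mapped to max_leaf_nodes) is only a stand-in for ERTR made for this example.

```python
# Illustrative sketch of the 30% hold-out validation and the choice of V.
# DecisionTreeRegressor stands in for ERTR; mapping V to max_leaf_nodes
# is an assumption made for this example only.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def select_V(X, y, candidates=(15, 20, 25), seed=0):
    # Hold out 30% of the training data as the validation set.
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=seed)
    best_V, best_mse = None, np.inf
    for V in candidates:
        model = DecisionTreeRegressor(max_leaf_nodes=V, random_state=seed).fit(X_tr, y_tr)
        mse = mean_squared_error(y_val, model.predict(X_val))
        if mse < best_mse:
            best_V, best_mse = V, mse
    return best_V
```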