Smaller, more accurate regression forests using tree alternating optimization

Authors: Arman Zharmagambetov, Miguel Á. Carreira-Perpiñán

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In a wide range of datasets, we show that the resulting forests exceed the accuracy of state-of-the-art algorithms such as random forests, AdaBoost or gradient boosting, often considerably, while yielding forests that have usually fewer and shallower trees and hence fewer parameters and faster inference overall.
Researcher Affiliation | Academia | Dept. of Computer Science & Engineering, University of California, Merced, USA. Correspondence to: Arman Zharmagambetov <azharmagambetov@ucmerced.edu>, Miguel Á. Carreira-Perpiñán <mcarreira-perpinan@ucmerced.edu>.
Pseudocode | Yes | Algorithm 1: TAO regression tree algorithm (BFS order)
    input: training set; initial tree T(·; Θ) of depth ∆, N_0, ..., N_∆ nodes at depth 0, ..., ∆, respectively
    R_1 ← {1, ..., N}
    repeat
        for d = 0 to ∆ do
            parfor i ∈ N_d do
                if i is a leaf then
                    θ_i ← train regressor g_i on reduced set R_i
                else
                    θ_i ← train decision function f_i on R_i
                    compute the reduced sets of each child of i
                end if
            end parfor
        end for
    until stop
    prune dead subtrees of T
    return T
    (A minimal Python sketch of this procedure is given after the table.)
Open Source Code | No | The paper states "We implemented TAO in Python" and mentions a C implementation, but it does not provide any concrete access information (e.g., a specific repository link or an explicit code-release statement) for the source code.
Open Datasets | Yes | Datasets: abalone, ailerons, cpuact, CT slice; for each, we give (N, D, K) = sample size and input and output dimensionality. [...] We compare TAO with the state-of-the-art tree ensembling algorithms: Random Forests (RF) (Breiman, 2001), Extra Trees (ET) (Geurts et al., 2006), AdaBoost (Freund & Schapire, 1997) (all using the Python scikit-learn implementation; Pedregosa et al., 2011); and gradient boosting (Friedman, 2001) (using the highly optimized XGBoost implementation; Chen & Guestrin, 2016). [...] Tables 1–3 provide details for "abalone", "ailerons", "cpuact", "CT slice", "Year Prediction MSD", "SARCOS", "MNIST". (A baseline-setup sketch follows the table.)
Dataset Splits | No | The paper mentions training each tree on a "90% random sample of the training data" and that hyperparameters "could be determined by cross-validation", but it does not explicitly state the training/validation/test splits or the cross-validation setup used for the overall model evaluation in a reproducible manner.
Hardware Specification | No | The paper does not describe the hardware (e.g., CPU/GPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions using the "Python scikit-learn implementation" and the "XGBoost implementation" for baseline comparisons, and LIBLINEAR for solving the logistic regression, but it does not provide version numbers for these software components.
Experiment Setup | Yes | As for TAO, we train each tree on a 90% random sample of the training data using 40 iterations. [...] We initialize each TAO tree from a complete tree of depth ∆ and random node parameters (each node's weight vector has Gaussian N(0,1) entries, and then we normalize the vector to unit length). [...] We train each tree with an ℓ1 regularizer but set its hyperparameter α to a small value (0.01). [...] Most importantly, the forest size should be as big as possible (depth ∆, number of trees T) but also avoiding overfitting; practically, ∆ and T could be determined by cross-validation. (A forest-construction sketch follows the table.)
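
To make the Algorithm 1 row concrete, the following is a minimal Python sketch of a single TAO regression tree under stated assumptions: constant leaves, squared error, and an ℓ1-regularized logistic regression (scikit-learn's LIBLINEAR solver) as a stand-in for the paper's weighted decision-node subproblem. All names (TAONode, reduced_sets, optimize_node, fit_tao_tree) are ours, not taken from the paper's implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression


class TAONode:
    """One node of a complete oblique binary tree of fixed depth."""
    def __init__(self, depth, max_depth, D, rng):
        self.is_leaf = depth == max_depth
        if self.is_leaf:
            self.value = 0.0                              # constant-leaf regressor (our choice)
        else:
            w = rng.normal(size=D)                        # Gaussian N(0,1) entries...
            self.w = w / np.linalg.norm(w)                # ...normalized to unit length
            self.b = 0.0
            self.left = TAONode(depth + 1, max_depth, D, rng)
            self.right = TAONode(depth + 1, max_depth, D, rng)

    def predict(self, x):
        if self.is_leaf:
            return self.value
        child = self.right if x @ self.w + self.b > 0 else self.left
        return child.predict(x)


def reduced_sets(root, X):
    """Map each node to the indices of the training points that reach it."""
    sets, stack = {id(root): np.arange(len(X))}, [root]
    while stack:
        node = stack.pop()
        if node.is_leaf:
            continue
        idx = sets[id(node)]
        go_right = X[idx] @ node.w + node.b > 0
        sets[id(node.left)], sets[id(node.right)] = idx[~go_right], idx[go_right]
        stack += [node.left, node.right]
    return sets


def optimize_node(node, X, y, alpha=0.01):
    """One TAO step on a single node, keeping the rest of the tree fixed."""
    if len(y) == 0:
        return
    if node.is_leaf:
        node.value = float(y.mean())
        return
    # Pseudo-label each point with whichever child subtree predicts it better,
    # then refit the decision function as a weighted, ell_1-regularized classifier.
    loss_left = np.array([(node.left.predict(x) - t) ** 2 for x, t in zip(X, y)])
    loss_right = np.array([(node.right.predict(x) - t) ** 2 for x, t in zip(X, y)])
    z = (loss_right < loss_left).astype(int)
    if len(np.unique(z)) < 2:
        return                                            # every point prefers the same child
    clf = LogisticRegression(penalty="l1", C=1.0 / alpha, solver="liblinear")
    clf.fit(X, z, sample_weight=np.abs(loss_left - loss_right) + 1e-12)
    node.w, node.b = clf.coef_.ravel(), float(clf.intercept_[0])


def fit_tao_tree(X, y, depth=4, iters=10, seed=0):
    """Alternating optimization over the nodes of a fixed-structure tree."""
    root = TAONode(0, depth, X.shape[1], np.random.default_rng(seed))
    for _ in range(iters):
        level = [root]
        while level:                                      # visit nodes depth by depth (BFS)
            sets = reduced_sets(root, X)
            for node in level:                            # nodes at one depth are independent
                idx = sets.get(id(node), np.arange(0))
                optimize_node(node, X[idx], y[idx])
            level = [c for n in level if not n.is_leaf for c in (n.left, n.right)]
    return root

Because nodes at the same depth have disjoint reduced sets, the inner loop over a level could run in parallel, which is what the parfor in Algorithm 1 indicates; the sketch simply loops sequentially.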
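
The Experiment Setup row maps onto a simple bagging loop. This sketch assumes the fit_tao_tree helper from the single-tree sketch; the 90% sample fraction and the 40 TAO iterations come from the quoted text, while the number of trees and the depth are placeholders (the paper notes they could be determined by cross-validation).

import numpy as np

def fit_tao_forest(X, y, n_trees=30, depth=8, iters=40, sample_frac=0.9, seed=0):
    """Train n_trees TAO trees, each on an independent 90% random sample of the data."""
    rng = np.random.default_rng(seed)
    trees = []
    for t in range(n_trees):
        idx = rng.choice(len(X), size=int(sample_frac * len(X)), replace=False)
        trees.append(fit_tao_tree(X[idx], y[idx], depth=depth, iters=iters, seed=seed + t))
    return trees

def predict_forest(trees, X):
    """Average the individual tree predictions to get the forest output."""
    return np.mean([[tree.predict(x) for x in X] for tree in trees], axis=0)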
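
For the baselines named in the Open Datasets row, here is a usage sketch with the libraries the paper cites (scikit-learn for RF, ET and AdaBoost; XGBoost for gradient boosting). The synthetic data, train/test split and hyperparameters are illustrative placeholders, not the paper's settings.

import numpy as np
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor, AdaBoostRegressor
from sklearn.metrics import mean_squared_error
from xgboost import XGBRegressor

# Placeholder regression data standing in for the paper's benchmark datasets.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = X[:, 0] * np.sin(X[:, 1]) + 0.1 * rng.normal(size=2000)
X_train, X_test, y_train, y_test = X[:1500], X[1500:], y[:1500], y[1500:]

baselines = {
    "RF": RandomForestRegressor(n_estimators=100),
    "ET": ExtraTreesRegressor(n_estimators=100),
    "AdaBoost": AdaBoostRegressor(n_estimators=100),
    "XGBoost": XGBRegressor(n_estimators=100, max_depth=6),
}
for name, model in baselines.items():
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name}: test RMSE = {rmse:.3f}")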