Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach
Authors: Mim van den Bos, Jacobus G. M. van der Linden, Emir Demirović
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our methods improve scalability by one or more orders of magnitude over the state-of-the-art optimal methods while performing similarly or better in out-of-sample performance. |
| Researcher Affiliation | Academia | Department of Software Technology, Delft University of Technology, Delft, The Netherlands. Correspondence to: Jacobus G. M. van der Linden <J.G.M.vanderLinden@tudelft.nl>. |
| Pseudocode | Yes | Algorithm 1: Depth-two tree search for a data set D and a feature set Fb; Algorithm 2: Pseudo-code for SRT given a data set D, feature sets F and Fb, a depth budget d, a node budget n, an upper bound ub, a complexity cost λ and a minimum leaf node size M. |
| Open Source Code | Yes | We have implemented our methods in C++ using the STreeD framework (Van der Linden et al., 2023) and provide it as a Python package (https://github.com/AlgTUDelft/pystreed). |
| Open Datasets | Yes | All data sets were obtained from the UCI Repository (Dua & Graff, 2017) and split into five folds. |
| Dataset Splits | Yes | All data sets were obtained from the UCI Repository (Dua & Graff, 2017) and split into five folds. For out-of-sample analysis, each method is tested on each of the five folds of the data sets with the other folds as training data. For CART, OSRT, SRT-C, SRT-SL, and SRT-L we hyper-tune the regularization parameter λ using five-fold cross-validation. |
| Hardware Specification | Yes | Experiments were run with one thread on an Intel Xeon E5-6248R 3.0GHz with 100GB RAM. IAI was instead compared with SRT-C separately on a local machine with an Intel i7-6600U CPU with 8GB RAM. |
| Software Dependencies | Yes | The MIP methods optimize the mean absolute error and are solved with Gurobi 9.5. |
| Experiment Setup | Yes | The piecewise constant and linear methods are trained with a maximum depth of five and four respectively, for normalized λ = 0.1, 0.01, 0.001, and 0.0001. We stop the computation after 10,000 iterations or when the change in SSE is less than 0.1%. |
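The dataset-splits row describes tuning the regularization parameter λ with five-fold cross-validation over the grid λ = 0.1, 0.01, 0.001, 0.0001. A minimal sketch of that selection loop is below; `train_tree` and `mse` are hypothetical stand-ins for the paper's tree trainer and its held-out error measure, not part of the pystreed API.

```python
import numpy as np

def train_tree(X, y, lam):
    # Hypothetical placeholder for fitting a regression tree with
    # complexity cost lam; here it just "predicts" the training mean.
    return float(np.mean(y))

def mse(model, X, y):
    # Mean squared error of the placeholder constant model.
    return float(np.mean((y - model) ** 2))

def tune_lambda(X, y, lambdas=(0.1, 0.01, 0.001, 0.0001), k=5, seed=0):
    """Pick the λ with the lowest mean k-fold cross-validation error."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        errs = []
        for i in range(k):
            test = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            model = train_tree(X[train], y[train], lam)
            errs.append(mse(model, X[test], y[test]))
        err = float(np.mean(errs))
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err
```

In the actual experiments each method is refit and evaluated on each of the five outer folds; the sketch shows only the inner λ-selection step.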
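The experiment-setup row gives the stopping rule for the iterative baselines: halt after 10,000 iterations or once the relative change in SSE falls below 0.1%. A small sketch of that check, with assumed parameter names (`max_iter`, `rel_tol`), could look like this:

```python
def should_stop(iteration, sse_prev, sse_curr,
                max_iter=10_000, rel_tol=0.001):
    """Return True after max_iter iterations, or once the relative
    change in SSE between iterations drops below rel_tol (0.1%)."""
    if iteration >= max_iter:
        return True
    if sse_prev > 0 and abs(sse_prev - sse_curr) / sse_prev < rel_tol:
        return True
    return False
```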