Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach

Authors: Mim van den Bos, Jacobus G. M. van der Linden, Emir Demirović

ICML 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "The experimental results show that our methods improve scalability by one or more orders of magnitude over the state-of-the-art optimal methods while performing similarly or better in out-of-sample performance." |
| Researcher Affiliation | Academia | "Department of Software Technology, Delft University of Technology, Delft, The Netherlands. Correspondence to: Jacobus G. M. van der Linden <J.G.M.vanderLinden@tudelft.nl>." |
| Pseudocode | Yes | "Algorithm 1: Depth-two tree search for a data set D and a feature set Fb." and "Algorithm 2: Pseudo-code for SRT given a data set D, feature sets F and Fb, a depth budget d, a node budget n, an upper bound ub, a complexity cost λ and a minimum leaf node size M." (see the depth-two search sketch after the table) |
| Open Source Code | Yes | "We have implemented our methods in C++ using the STreeD framework (Van der Linden et al., 2023) and provide it as a Python package." and "https://github.com/AlgTUDelft/pystreed" (see the usage sketch after the table) |
| Open Datasets | Yes | "All data sets were obtained from the UCI Repository (Dua & Graff, 2017) and split into five folds." |
| Dataset Splits | Yes | "All data sets were obtained from the UCI Repository (Dua & Graff, 2017) and split into five folds. For out-of-sample analysis, each method is tested on each of the five folds of the data sets with the other folds as training data. For CART, OSRT, SRT-C, SRT-SL, and SRT-L we hyper-tune the regularization parameter λ using five-fold cross-validation." (see the cross-validation sketch after the table) |
| Hardware Specification | Yes | "Experiments were run with one thread on an Intel Xeon E5-6248R 3.0GHz with 100GB RAM." and "Instead, we compared IAI with SRT-C separately on a local machine with an Intel i7-6600U CPU with 8GB RAM." |
| Software Dependencies | Yes | "The MIP methods optimize the mean absolute error and are solved with Gurobi 9.5." (see the MAE sketch after the table) |
| Experiment Setup | Yes | "The piecewise constant and linear methods are trained with a maximum depth of five and four respectively." and "for normalized λ = 0.1, 0.01, 0.001, and 0.0001." and "We stop the computation after 10,000 iterations or when the change in SSE is less than 0.1%." (see the stopping-rule sketch after the table) |
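
To make the role of Algorithm 1 concrete, here is a minimal brute-force sketch of a depth-two search for a piecewise constant regression tree over binary features. It enumerates root/child feature combinations and scores them by sum of squared errors (SSE) with mean-valued leaves; the paper's algorithm computes the same optimum far more efficiently via precomputed statistics per feature pair, so this is illustrative only, not the authors' implementation.

```python
# Illustrative brute-force depth-two search for a piecewise constant
# regression tree over binary features, scored by SSE with mean leaves.
import numpy as np

def leaf_sse(y):
    """SSE of the best constant prediction (the mean) for labels y."""
    return 0.0 if len(y) == 0 else float(np.sum((y - y.mean()) ** 2))

def best_depth_one(X, y, features):
    """Best single split (or no split at all) for one subtree, by SSE."""
    best = (leaf_sse(y), None)  # the no-split option
    for f in features:
        mask = X[:, f] == 1
        sse = leaf_sse(y[mask]) + leaf_sse(y[~mask])
        if sse < best[0]:
            best = (sse, f)
    return best

def best_depth_two(X, y, features):
    """Exhaustive depth-two search: root feature plus optimal subtrees."""
    best = (leaf_sse(y), None, None, None)
    for f in features:
        mask = X[:, f] == 1
        sse_l, f_l = best_depth_one(X[mask], y[mask], features)
        sse_r, f_r = best_depth_one(X[~mask], y[~mask], features)
        if sse_l + sse_r < best[0]:
            best = (sse_l + sse_r, f, f_l, f_r)
    return best  # (SSE, root feature, left child feature, right child feature)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 8))
y = np.where(X[:, 2] == 1, 3.0, -1.0) + 0.1 * rng.standard_normal(200)
print(best_depth_two(X, y, range(8)))  # should pick feature 2 at the root
```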
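The pystreed package advertises an sklearn-style interface. The usage sketch below assumes an estimator class named STreeDRegressor with a max_depth parameter; both names are assumptions that should be verified against the repository README before use.

```python
# Hypothetical usage of the authors' pystreed package; the class name
# STreeDRegressor and its parameters are assumed, not confirmed here.
import numpy as np
from pystreed import STreeDRegressor

rng = np.random.default_rng(0)
X, y = rng.random((200, 5)), rng.random(200)

model = STreeDRegressor(max_depth=5)  # depth budget used in the experiments
model.fit(X[:150], y[:150])
print(model.predict(X[150:])[:5])
```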
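The evaluation protocol (outer five-fold testing, inner five-fold cross-validation to tune λ over the quoted grid) can be reproduced with standard tooling. In this sketch, sklearn's DecisionTreeRegressor with its ccp_alpha complexity penalty stands in for the paper's methods, since λ plays an analogous cost-complexity role; it is a protocol illustration, not a reproduction of the paper's results.

```python
# Sketch of the paper's split protocol: outer 5-fold out-of-sample
# testing with inner 5-fold cross-validation over the complexity grid.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=0)
grid = {"ccp_alpha": [0.1, 0.01, 0.001, 0.0001]}  # λ grid from the paper

scores = []
for train_idx, test_idx in outer.split(X):
    search = GridSearchCV(DecisionTreeRegressor(max_depth=5), grid, cv=5)
    search.fit(X[train_idx], y[train_idx])          # tune λ on training folds
    scores.append(search.score(X[test_idx], y[test_idx]))
print(f"mean out-of-sample R^2: {np.mean(scores):.3f}")
```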
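For context on the MIP baselines: mean absolute error is typically linearized with one auxiliary variable per sample bounding the absolute residual. The sketch below shows that standard trick in gurobipy for a plain linear model (a Gurobi license is required); the paper's baselines embed this objective inside a much larger tree-structure formulation.

```python
# Standard LP/MIP linearization of mean absolute error in gurobipy,
# shown for a plain linear model rather than a tree formulation.
import gurobipy as gp
from gurobipy import GRB
import numpy as np

n, p = 50, 3
rng = np.random.default_rng(0)
X = rng.random((n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

m = gp.Model("mae_regression")
w = m.addVars(p, lb=-GRB.INFINITY, name="w")   # model coefficients
e = m.addVars(n, lb=0.0, name="e")             # bounds on |residual|
for i in range(n):
    pred = gp.quicksum(w[j] * X[i, j] for j in range(p))
    m.addConstr(e[i] >= y[i] - pred)           # e_i >= residual
    m.addConstr(e[i] >= pred - y[i])           # e_i >= -residual
m.setObjective((1.0 / n) * e.sum(), GRB.MINIMIZE)  # mean absolute error
m.optimize()
```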
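The quoted stopping rule for the iterative baseline (at most 10,000 iterations, or a relative change in SSE below 0.1%) amounts to a simple convergence loop. In this sketch, improve_tree is a hypothetical placeholder for one refinement pass of such a method.

```python
# Convergence loop matching the quoted stopping rule: stop after 10,000
# iterations or when the relative change in SSE falls below 0.1%.
def run_until_converged(improve_tree, initial_sse, max_iter=10_000, tol=1e-3):
    sse = initial_sse
    for _ in range(max_iter):
        new_sse = improve_tree()                       # one refinement pass
        if abs(sse - new_sse) / max(sse, 1e-12) < tol:  # change in SSE < 0.1%
            return new_sse
        sse = new_sse
    return sse
```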