Optimal Sparse Regression Trees

Authors: Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We ran experiments on 12 datasets; the details are described in Appendix C.1. Our evaluation answers the following: 1. Are trees generated by existing regression tree optimization methods truly optimal? 2. How well do optimal sparse regression trees generalize? 3. How far from optimal are greedy-approach models? (§6.1)"
Researcher Affiliation | Academia | "Rui Zhang¹*, Rui Xin¹*, Margo Seltzer², Cynthia Rudin¹ — ¹Duke University, ²University of British Columbia"
Pseudocode | Yes | "Algorithm 1: compute_lower_bound(dataset, sub, λ) // For a subproblem sub and regularization λ, compute its Equivalent k-Means Lower Bound"
Open Source Code | Yes | "Code Availability: The implementation of OSRT is available at https://github.com/ruizhang1996/optimal-sparse-regression-tree-public."
Open Datasets | Yes | "An example tree for the seoul bike dataset (VE and Cho 2020; Sathishkumar, Park, and Cho 2020; Dua and Graff 2017) constructed by our method is shown in Figure 1."
Dataset Splits | Yes | "Optimization experiments in Appendix D and cross-validation experiments in Appendix H, along with a demonstration of these results in Figure 2, show: (1) trees produced by other methods are usually sub-optimal even when they claim optimality (they do not prove optimality); only our method consistently finds the optimal trees, which trace the efficient frontier of the trade-off between loss and sparsity."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions software such as IAI, Evtree, GUIDE, and CART, but it does not specify version numbers for these or for any other dependencies such as programming languages or libraries.
Experiment Setup | Yes | "Figure 1: Optimal regression tree for the seoul bike dataset with λ = 0.05, max depth = 5."
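The λ = 0.05 setting above is the sparsity regularizer: methods in this family score a candidate tree by its prediction loss plus λ times the number of leaves. The sketch below illustrates that kind of objective for a piecewise-constant regression tree; it is an illustrative assumption, not the authors' OSRT implementation, and the function name and normalization convention are ours.

```python
# Hypothetical sketch of a sparsity-regularized regression-tree objective:
# normalized squared-error loss plus lam * (number of leaves).
# Illustrative only; not the OSRT authors' code.
import numpy as np

def regularized_objective(y, leaf_assignments, lam):
    """Score a tree given the leaf each sample falls into.

    y: array of regression targets.
    leaf_assignments: leaf id per sample (each leaf predicts its mean).
    lam: per-leaf sparsity penalty (the lambda in the paper's figures).
    """
    y = np.asarray(y, dtype=float)
    leaf_assignments = np.asarray(leaf_assignments)
    leaves = np.unique(leaf_assignments)
    sse = 0.0
    for leaf in leaves:
        y_leaf = y[leaf_assignments == leaf]
        sse += np.sum((y_leaf - y_leaf.mean()) ** 2)  # leaf predicts its mean
    loss = sse / len(y)  # normalize loss by sample count
    return loss + lam * len(leaves)  # penalize every additional leaf

# A 2-leaf tree that separates the targets perfectly pays only the
# sparsity penalty: objective = 0 + 0.05 * 2 = 0.1.
obj = regularized_objective([1, 1, 5, 5], [0, 0, 1, 1], lam=0.05)
```

Under an objective like this, adding a split is only worthwhile when it reduces the normalized loss by more than λ, which is how a single parameter trades accuracy against tree size.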