Optimal Sparse Regression Trees
Authors: Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran experiments on 12 datasets; the details are described in Appendix C.1. Our evaluation answers the following: 1. Are trees generated by existing regression tree optimization methods truly optimal? How well do optimal sparse regression trees generalize? How far from optimal are greedy-approach models? (§6.1) |
| Researcher Affiliation | Academia | Rui Zhang¹*, Rui Xin¹*, Margo Seltzer², Cynthia Rudin¹ (¹Duke University, ²University of British Columbia) |
| Pseudocode | Yes | Algorithm 1: compute_lower_bound(dataset, sub, λ) ⇒ lower bound // For a subproblem sub and regularization λ, compute its Equivalent k-Means Lower Bound (an illustrative sketch of this bound follows the table) |
| Open Source Code | Yes | Code Availability: The implementation of OSRT is available at https://github.com/ruizhang1996/optimal-sparse-regression-tree-public. |
| Open Datasets | Yes | An example tree for the seoul bike dataset (VE and Cho 2020; Sathishkumar, Park, and Cho 2020; Dua and Graff 2017) constructed by our method is shown in Figure 1. |
| Dataset Splits | Yes | Optimization experiments in Appendix D and cross-validation experiments in Appendix H, along with a demonstration of these results in Figure 2, show: (1) trees produced by other methods are usually sub-optimal even if they claim optimality (they do not prove optimality), and only our method can consistently find the optimal trees, which form the efficient frontiers that optimize the trade-off between loss and sparsity |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions software like IAI, Evtree, GUIDE, and CART, but it does not specify version numbers for these or any other ancillary software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | Figure 1: Optimal regression tree for seoul bike dataset with λ = 0.05, max depth = 5. |
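
The Pseudocode row above quotes Algorithm 1, which computes an "Equivalent k-Means Lower Bound" for a subproblem. The sketch below is not the authors' implementation; it only illustrates the idea under two assumptions: that the bound can be written as min over k of (optimal 1-D k-means SSE of the subproblem's targets, normalized by the full dataset size, plus λ·k), and that every sample is treated as a distinct point (the paper's algorithm first groups equivalent points, i.e. samples with identical features). Function names such as `kmeans_1d_sse` and `equivalent_kmeans_lower_bound` are hypothetical.

```python
# Illustrative sketch only: NOT the authors' Algorithm 1.
# Assumed form of the bound: min_k [ kmeans_SSE_k(targets in sub) / n_total + lambda * k ],
# i.e. no subtree over this subproblem can beat the best partition of its
# targets into k groups, each group paying a lambda-per-leaf penalty.
import numpy as np


def kmeans_1d_sse(y_sorted: np.ndarray, k_max: int) -> np.ndarray:
    """Exact 1-D k-means SSE for k = 1..k_max via dynamic programming.

    Optimal 1-D clusters are contiguous in sorted order, so a DP over
    split points gives the exact within-cluster sum of squared errors.
    """
    n = len(y_sorted)
    s1 = np.concatenate(([0.0], np.cumsum(y_sorted)))       # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(y_sorted ** 2)))  # prefix sums of squares

    def seg_cost(i: int, j: int) -> float:
        # SSE of placing y_sorted[i..j] (inclusive) in a single cluster.
        cnt = j - i + 1
        tot = s1[j + 1] - s1[i]
        return (s2[j + 1] - s2[i]) - tot * tot / cnt

    # dp[j] = best SSE for the first j+1 points with the current cluster budget
    dp = np.array([seg_cost(0, j) for j in range(n)])
    sse = [dp[-1]]
    for _ in range(2, k_max + 1):
        new_dp = np.full(n, np.inf)
        for j in range(n):
            for i in range(j + 1):
                prev = 0.0 if i == 0 else dp[i - 1]
                new_dp[j] = min(new_dp[j], prev + seg_cost(i, j))
        dp = new_dp
        sse.append(dp[-1])
    return np.array(sse)


def equivalent_kmeans_lower_bound(y_sub: np.ndarray, n_total: int,
                                  lam: float, k_max: int = 10) -> float:
    """Lower bound on (normalized squared loss + lambda * #leaves) for a subproblem."""
    y_sorted = np.sort(np.asarray(y_sub, dtype=float))
    k_max = min(k_max, len(y_sorted))
    sse = kmeans_1d_sse(y_sorted, k_max)
    ks = np.arange(1, k_max + 1)
    return float(np.min(sse / n_total + lam * ks))


# Example: targets that split cleanly into two groups.
y = np.array([1.0, 1.1, 0.9, 5.0, 5.2, 4.8])
print(equivalent_kmeans_lower_bound(y, n_total=6, lam=0.05))
```

Because the exact 1-D k-means SSE is non-increasing in k while the λ·k penalty grows, the minimum over k is well defined; a larger λ (e.g. the λ = 0.05 used in the Figure 1 quote above) pushes the bound toward fewer leaves.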