Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Exploring the Whole Rashomon Set of Sparse Decision Trees
Authors: Rui Xin, Chudi Zhong, Zhi Chen, Takuya Takagi, Margo Seltzer, Cynthia Rudin
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation answers the following questions: 1. How does TreeFARMS compare to baseline methods for searching the hypothesis space? (§6.1), 2. How quickly can we find the entire Rashomon set? (§6.1), 3. What does the Rashomon set look like? What can we learn about its structure? (§G.2), 4. What does MCR look like for real datasets? (§6.2), 5. How do balanced accuracy and F1-score Rashomon sets compare to the accuracy Rashomon set? (§6.3), and 6. How does removing samples affect the Rashomon set? (§6.4). |
| Researcher Affiliation | Collaboration | 1 Duke University 2 Fujitsu Laboratories Ltd. 3 The University of British Columbia |
| Pseudocode | Yes | Algorithm 1 TreeFARMS(x, y, λ, ϵ) → Rset // Given a dataset (x, y), λ, and ϵ, return the set, Rset, of all trees whose objective is in θϵ. Algorithm 2 extract(G, sub, scope) (Detailed algorithm in Appendix B) |
| Open Source Code | Yes | Code Availability: An implementation of TreeFARMS is available at https://github.com/ubc-systopia/treeFarms. |
| Open Datasets | Yes | We use datasets from the UCI Machine Learning Repository [Car Evaluation, Congressional Voting Records, Monk2, and Iris, see 43], a penguin dataset [44], a criminal recidivism dataset [COMPAS, shared by 40], the Fair Isaac (FICO) credit risk dataset [45] used for the Explainable ML Challenge, and four coupon datasets (Bar, Coffee House, Cheap Restaurant, and Expensive Restaurant) [46] that come from surveys. More details are in Appendix F. |
| Dataset Splits | No | We denote the training dataset as $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \{0, 1\}^p$ are binary features. |
| Hardware Specification | No | The paper states that information on the type of resources used is available in Appendix F and G, but these appendices are not provided in the main paper. The main text does not include specific hardware details such as GPU/CPU models or memory. |
| Software Dependencies | No | We used the R package BART [37]. |
| Experiment Setup | Yes | Figure 1: Comparison of trees in the Rashomon set (λ = 0.01, ϵ = 0.1) and trees generated by baselines. Figure 3: Variable Importance: Model class reliance on the COMPAS and Bar (λ = 0.01, ϵ = 0.05). |
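
The Pseudocode and Experiment Setup rows refer to a Rashomon set controlled by λ and ϵ. As a rough illustration only, the sketch below filters a hand-made list of candidate trees by an objective threshold. It assumes the objective is the training misclassification rate plus λ times the number of leaves, and that the threshold is (1 + ϵ) times the best objective; neither formula is stated in the table above. The `Candidate` class, the `rashomon_set` helper, and the example values are hypothetical, and the exhaustive filter is not the paper's Algorithm 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one decision tree (not the paper's data structure)."""
    name: str
    error: float  # training misclassification rate, in [0, 1]
    leaves: int   # number of leaves, used as the sparsity penalty


def objective(c: Candidate, lam: float) -> float:
    # Assumed objective: loss plus a per-leaf penalty.
    return c.error + lam * c.leaves


def rashomon_set(candidates: list[Candidate], lam: float, eps: float) -> list[Candidate]:
    """Keep every candidate whose objective is within (1 + eps) of the best one."""
    if not candidates:
        return []
    best = min(objective(c, lam) for c in candidates)
    threshold = (1.0 + eps) * best  # assumed form of the threshold
    return [c for c in candidates if objective(c, lam) <= threshold]


# Made-up candidates, evaluated with the Figure 1 settings (lambda = 0.01, eps = 0.1).
candidates = [
    Candidate("t1", error=0.20, leaves=4),
    Candidate("t2", error=0.21, leaves=4),
    Candidate("t3", error=0.30, leaves=2),
    Candidate("t4", error=0.18, leaves=9),
]
selected = rashomon_set(candidates, lam=0.01, eps=0.1)
selected_names = [c.name for c in selected]
```

In this toy example a tree with the lowest raw error (`t4`) can still fall outside the set because its leaf penalty pushes its objective past the threshold, which is the trade-off λ is meant to control.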