Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Connecting Interpretability and Robustness in Decision Trees through Separation
Authors: Michal Moshkovitz, Yao-Yuan Yang, Kamalika Chaudhuri
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Section 6, we test the validity of the separability assumption and the quality of the new algorithm on real-world datasets that were used previously in tree-based explanation research. |
| Researcher Affiliation | Academia | 1University of California, San Diego. Correspondence to: Michal Moshkovitz <EMAIL>, Yao-Yuan Yang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 BBM-RS (BBM-Risk Score) |
| Open Source Code | Yes | The code for the experiments is available at https://github.com/yangarbiter/interpretable-robust-trees. |
| Open Datasets | Yes | To maintain compatibility with prior work on interpretable and robust decision trees (Ustun & Rudin, 2019; Lin et al., 2020), we use the following pre-processed datasets from their repositories adult, bank, breastcancer, mammo, mushroom, spambase, careval, ficobin, and campasbin. We also use some datasets from other sources such as LIBSVM (Chang & Lin, 2011) datasets and Moro et al. (2014). |
| Dataset Splits | Yes | We use a 5-fold cross-validation based on accuracy for hyperparameters selection. For DT and Rob DT, we search through 5, 10, ..., 30 for the maximum depth of the tree. For BBM-RS, we search through 5, 10, ..., 30 for the maximum number of weak learners (T). The algorithm stops when it reaches T iterations or if no weak learner can produce a weighted accuracy > 0.51. For LCPA, we search through 5, 10, ..., 30 for the maximum ℓ0 norm of the weight vector. We set the robust radius for Rob DT and the noise level for BBM-RS to 0.05. More details about the setup of the algorithms can be found in Appendix B. The data is randomly split into training and testing sets by 2:1. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using `scikit-learn` and `LIBSVM` datasets in the context of general practices and related work (e.g., Pedregosa et al., 2011; Chang & Lin, 2011), but it does not specify exact version numbers for the software dependencies or libraries used in its own experimental setup or implementation. |
| Experiment Setup | Yes | For DT and Rob DT, we search through 5, 10, ..., 30 for the maximum depth of the tree. For BBM-RS, we search through 5, 10, ..., 30 for the maximum number of weak learners (T). The algorithm stops when it reaches T iterations or if no weak learner can produce a weighted accuracy > 0.51. For LCPA, we search through 5, 10, ..., 30 for the maximum ℓ0 norm of the weight vector. We set the robust radius for Rob DT and the noise level for BBM-RS to 0.05. |
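
The split and hyperparameter search quoted in the table (a random 2:1 train/test split, then 5-fold cross-validation over the values 5, 10, ..., 30) could be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' code: the function names, the seed, and the `fit_and_score` callback are hypothetical, and running cross-validation on the training portion only is an assumption, since the quoted text does not say which portion is used.

```python
import random
from statistics import mean

# Candidate values "5, 10, ..., 30" from the table; they stand for the maximum
# tree depth (DT, Rob DT), the maximum number of weak learners T (BBM-RS),
# or the l0-norm bound (LCPA).
GRID = list(range(5, 31, 5))


def train_test_split_2_to_1(n, seed=0):
    """Randomly split indices 0..n-1 into train and test at a 2:1 ratio."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an illustrative choice
    cut = (2 * n) // 3
    return idx[:cut], idx[cut:]


def k_folds(indices, k=5):
    """Yield (fit_idx, val_idx) pairs; each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        fit = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield fit, val


def select_by_cv(train_idx, fit_and_score, grid=GRID, k=5):
    """Return the grid value with the highest mean validation accuracy.

    fit_and_score(value, fit_idx, val_idx) is a hypothetical callback that
    trains a model with the given hyperparameter value on fit_idx and
    returns its accuracy on val_idx.
    """
    def cv_accuracy(value):
        return mean(fit_and_score(value, fit, val)
                    for fit, val in k_folds(train_idx, k))

    return max(grid, key=cv_accuracy)


# Usage with a toy scorer standing in for real model training.
train_idx, test_idx = train_test_split_2_to_1(300)
best = select_by_cv(train_idx, lambda v, fit, val: 1.0 - abs(v - 15) / 100)
```

In a real run, the callback would train one of the compared models (DT, Rob DT, BBM-RS, or LCPA) and the selected value would then be evaluated once on the held-out test indices.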