Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On Tackling Explanation Redundancy in Decision Trees
Authors: Yacine Izza, Alexey Ignatiev, Joao Marques-Silva
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper offers both theoretical and experimental arguments demonstrating that, as long as interpretability of decision trees equates with succinctness of explanations, then decision trees ought not be deemed interpretable. ... In addition, the paper includes experimental results substantiating that path explanation redundancy is observed ubiquitously in decision trees, including those obtained using different tree learning algorithms, but also in a wide range of publicly available decision trees. |
| Researcher Affiliation | Academia | Yacine Izza, University of Toulouse, Toulouse, France; Alexey Ignatiev, Monash University, Melbourne, Australia; Joao Marques-Silva, IRIT, CNRS, Toulouse, France |
| Pseudocode | Yes | Algorithm 1 summarizes the main steps of the proposed approach for computing an APXp for a concrete path Pk. ... Algorithm 2: Checking consistent path to prediction in K \ {c} |
| Open Source Code | Yes | Sources are provided as a Python package and available at https://github.com/yizza91/xpg |
| Open Datasets | Yes | The assessment is performed on a selection of 67 publicly available datasets, which originate from the UCI Machine Learning Repository (UCI, 2020), Penn Machine Learning Benchmarks (Penn ML, 2020), and the OpenML repository (OpenML, 2020). |
| Dataset Splits | No | The paper reports 'test accuracy %A' in its tables, implying the use of dataset splits. However, it does not state split percentages (e.g., an 80/20 train/test split), per-split sample counts, or the split methodology (e.g., random seed, stratified sampling), nor does it cite predefined splits for the datasets used. |
| Hardware Specification | Yes | The experiments are performed on a MacBook Pro with a Dual-Core Intel Core i5 2.3 GHz CPU and 8 GB RAM, running macOS Catalina. |
| Software Dependencies | No | The poly-time explanation-redundancy check algorithm presented in (Izza et al., 2020) and AXp extraction by Tree Traversal outlined in Section 5.2 are implemented in Perl. (An implementation using the PySAT (Ignatiev et al., 2018a) toolkit and the solver Glucose was instrumented in validating the results, but for the DTs considered, it was in general slower by at least one order of magnitude.) Additionally, the Propositional Horn Encoding approach outlined in Section 5.3, as well as the enumeration of AXps/CXps described in Section 5.5, are implemented in Python. |
| Experiment Setup | Yes | ITI is run with the pruning option enabled, which helps avoid overfitting and aims at constructing shallow DTs. To enforce IAI to produce shallow DTs and achieve high accuracy, it is set to use the optimal tree classifier method with the maximal depth of 6. |
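The Dataset Splits row flags that the paper never documents how its train/test splits were produced. As a minimal sketch of the kind of detail a reproducibility check looks for, the snippet below builds a seeded, deterministic 80/20 index split; the sample count (150), split ratio, and seed are illustrative assumptions, not values taken from the paper.

```python
import random

def split_indices(n, test_frac=0.2, seed=42):
    """Return (train, test) index lists using a fixed RNG seed,
    so the split is fully reproducible from the stated parameters."""
    rng = random.Random(seed)       # explicit seed: same split every run
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * (1 - test_frac))  # size of the training portion
    return idx[:cut], idx[cut:]

train, test = split_indices(150)
print(len(train), len(test))  # 120 30
```

Reporting the seed, the ratio, and the resulting per-split counts (here 120/30) is what would have turned this row's "No" into a "Yes".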