On Computing Probabilistic Explanations for Decision Trees

Authors: Marcelo Arenas, Pablo Barceló, Miguel Romero Orth, Bernardo Subercaseaux

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type | Experimental | "In spite of the intractability results we obtain in the paper, we show experimentally that our problems can be solved over practical instances by using SAT solvers." The appendix describes experiments over both synthetic and standard datasets (MNIST) and reports empirical results.
Researcher Affiliation | Academia | 1 Department of Computer Science, PUC Chile; 2 Institute for Mathematical and Computational Engineering, PUC Chile; 3 Faculty of Engineering and Science, UAI Chile; 4 Millennium Institute for Foundational Research on Data, Chile; 5 CENIA Chile; 6 Carnegie Mellon University, Pittsburgh, USA
Pseudocode | Yes | Algorithm 1: Minimal Sufficient Reason (excerpted lines):
  Input: Decision tree T and instance x, both of dimension n
  Output: A minimal sufficient reason y for x under T
  2: while true do
  3:   reduced ← false
  4:   for i ∈ {1, . . . , n} do
  7:     if CheckSufficientReason(T, ŷ, x) then
  9:       reduced ← true
  11: if (¬reduced) or |y| = n then
  12:   return y
Open Source Code | Yes | The appendix describes experiments over both synthetic and standard datasets (MNIST) and reports empirical results. A recent report by Izza et al. (2022) also uses automated reasoning for this problem, although through an SMT solver. "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?" [Yes]
Open Datasets | Yes | The appendix describes experiments over both synthetic and standard datasets (MNIST) and reports empirical results. Cited: L. Deng. "The MNIST database of handwritten digit images for machine learning research." IEEE Signal Processing Magazine, 29(6):141–142, 2012.
Dataset Splits | Yes | "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)?" [Yes]
Hardware Specification | Yes | "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?" [Yes]
Software Dependencies | No | No specific version numbers for software dependencies are provided in the main paper text.
Experiment Setup | Yes | "In order to encode that Pr_z[T(z) = T(x) | z ∈ COMP(y)] ≥ δ, we will directly encode that |{z ∈ COMP(y) | T(z) = T(x)}| ≥ ⌈δ · 2^|y|⌉, where we assume for simplicity that the ceiling can be taken safely, in order to have a value we can represent by an integer. This assumption is not crucial. As before, we will have f_i variables representing whether y[i] = ⊥ or not, and enforce that ∑_{i=1}^n f_i ≥ k." Also: "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)?" [Yes]
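The greedy loop excerpted in the Pseudocode row above can be sketched in Python. The tree representation (nested tuples over boolean features) and the brute-force `check_sufficient_reason` are illustrative assumptions for small instances, not the paper's SAT-based implementation:

```python
from itertools import product

# A decision tree over boolean features is either a leaf ("leaf", label)
# or an internal node ("node", feature_index, left, right), where the
# left branch is taken when the feature is 0 and the right when it is 1.

def classify(tree, z):
    """Evaluate the tree on a complete instance z (tuple of 0/1)."""
    while tree[0] == "node":
        _, i, left, right = tree
        tree = right if z[i] else left
    return tree[1]

def completions(y):
    """All complete instances consistent with partial instance y,
    whose entries are 0, 1, or None (undefined)."""
    free = [i for i, v in enumerate(y) if v is None]
    for bits in product([0, 1], repeat=len(free)):
        z = list(y)
        for i, b in zip(free, bits):
            z[i] = b
        yield tuple(z)

def check_sufficient_reason(tree, y, x):
    """y is sufficient for x under tree iff every completion of y
    receives the same label as x (brute force, exponential in the
    number of undefined features)."""
    target = classify(tree, x)
    return all(classify(tree, z) == target for z in completions(y))

def minimal_sufficient_reason(tree, x):
    """Greedy loop of Algorithm 1: repeatedly try to undefine one
    more feature while preserving sufficiency."""
    n = len(x)
    y = list(x)
    while True:
        reduced = False
        for i in range(n):
            if y[i] is not None:
                y_hat = y[:]          # candidate with feature i undefined
                y_hat[i] = None
                if check_sufficient_reason(tree, y_hat, x):
                    y = y_hat
                    reduced = True
        if not reduced or all(v is None for v in y):
            return tuple(y)
```

For example, for a tree computing T(z) = z[0] (built as `("node", 0, ("leaf", 0), ("leaf", 1))`) and instance x = (1, 0), the loop undefines the irrelevant second feature, returning `(1, None)`.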
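The counting condition quoted in the Experiment Setup row can likewise be illustrated directly: y is a δ-sufficient reason for x when the number of completions agreeing with T(x) is at least ⌈δ · |COMP(y)|⌉. This brute-force check is an assumption for illustration; the paper encodes the count into a SAT formula rather than enumerating completions:

```python
import math
from itertools import product

def classify(tree, z):
    """Evaluate a nested-tuple decision tree on a complete 0/1 instance."""
    while tree[0] == "node":
        _, i, left, right = tree
        tree = right if z[i] else left
    return tree[1]

def is_delta_sufficient(tree, y, x, delta):
    """Check |{z in COMP(y) : T(z) = T(x)}| >= ceil(delta * |COMP(y)|)
    by enumerating all completions of the partial instance y."""
    free = [i for i, v in enumerate(y) if v is None]
    target = classify(tree, x)
    agree = 0
    for bits in product([0, 1], repeat=len(free)):
        z = list(y)
        for i, b in zip(free, bits):
            z[i] = b
        agree += classify(tree, tuple(z)) == target
    return agree >= math.ceil(delta * 2 ** len(free))
```

For a tree computing T(z) = z[0] AND z[1] and x = (1, 1), the partial instance y = (1, None) agrees with T(x) on exactly half of its completions, so it is δ-sufficient for δ = 0.5 but not for δ = 0.75.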