Sufficient Reasons for Classifier Decisions in the Presence of Domain Constraints
Authors: Niku Gorji, Sasha Rubin (pp. 5660-5667)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate this improved succinctness on synthetic classifiers and classifiers learnt from real data. The workflow is illustrated in Figure 1. A classifier, typically learnt from data, is transformed into a decision function F. Domain constraints C are taken into account to produce a partial Boolean function FC, which is used to compute sufficient reasons for the classifier's decisions. In this section we validate our theory on constrained decision-functions learnt from binary data. We provide a prototype using a type of classifier that is often considered interpretable, i.e., decision trees. The purpose of the prototype is to provide a proof of concept that shows that by using constrained decision-functions: (1) we get no less succinct, and sometimes more succinct, reasons compared with the unconstrained setting; (2) we can seamlessly integrate two major types of constraints that can arise in AI, namely constraints due to pre-processing of data (e.g. one-hot, or other categorical, encodings), and semantic constraints that are inherent to the input domain. Case Study 1. We used the dataset of the Corticosteroid Randomization after Significant Head Injury (CRASH) trial (Collaborators et al. 2008) to predict the condition of a patient after a traumatic head injury. RPART (seed: 25, train: 0.75, cp: 0.005) correctly classifies 75.69% of instances in the test set (ROC 0.77). Case Study 2. To study semantic constraints, we used the Tic-Tac-Toe (TTT) Endgame dataset from the UCI machine learning repository (Dua and Graff 2017). We trained a classifier on this dataset using RPART (seed: 1, train: 0.7, cp: 0.01), with 93% accuracy on the test set (ROC 0.97); see Figure 4. |
| Researcher Affiliation | Academia | Niku Gorji, Sasha Rubin School of Computer Science, The University of Sydney, Australia niku.gorji@sydney.edu.au, sasha.rubin@sydney.edu.au |
| Pseudocode | No | The paper describes algorithms verbally and refers to existing tools, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions reusing algorithms in (Shih, Choi, and Darwiche 2018) and tools for symbolic computation, but does not provide a statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | Case Study 1. We used the dataset of the Corticosteroid Randomization after Significant Head Injury (CRASH) trial (Collaborators et al. 2008) to predict the condition of a patient after a traumatic head injury. Case Study 2. To study semantic constraints, we used the Tic-Tac-Toe (TTT) Endgame dataset from the UCI machine learning repository (Dua and Graff 2017). |
| Dataset Splits | Yes | RPART (seed: 25, train: 0.75, cp: 0.005) correctly classifies 75.69% of instances in the test set (ROC 0.77). We trained a classifier on this dataset using RPART (seed: 1, train: 0.7, cp: 0.01), with 93% accuracy on the test set (ROC 0.97). |
| Hardware Specification | No | The paper does not specify any details about the hardware (e.g., CPU, GPU models, memory, or cloud resources) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions 'RPART' and refers to 'Shih, Choi, and Darwiche 2018' for OBDD operations, but it does not provide specific version numbers for any software components used in the experiments. |
| Experiment Setup | Yes | RPART (seed: 25, train: 0.75, cp: 0.005) correctly classifies 75.69% of instances in the test set (ROC 0.77). We trained a classifier on this dataset using RPART (seed: 1, train: 0.7, cp: 0.01), with 93% accuracy on the test set (ROC 0.97). |
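The reported experiment setup (seed, train fraction, and complexity parameter cp for R's rpart) could be approximated as follows. This is a minimal sketch, not the authors' code: it uses scikit-learn in Python rather than rpart in R, treats rpart's `cp` as loosely analogous to sklearn's `ccp_alpha` (the two pruning criteria are not identical), and substitutes synthetic binary features for the CRASH data, which is not bundled with the paper.

```python
# Hedged approximation of the Case Study 1 setup: seed 25, 75% train split,
# cp 0.005. All data here is synthetic; feature names and the target rule
# are placeholders, not taken from the paper.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(25)
X = rng.integers(0, 2, size=(1000, 10))   # binary features, as in the paper's setting
y = (X[:, 0] & X[:, 1]) | X[:, 2]         # arbitrary Boolean target for illustration

# 75% train / 25% test, mirroring "train: 0.75"
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.75, random_state=25
)

# ccp_alpha stands in for rpart's cp; the correspondence is only approximate
clf = DecisionTreeClassifier(ccp_alpha=0.005, random_state=25)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.2%}")
```

The same pattern, with `random_state=1`, `train_size=0.7`, and `ccp_alpha=0.01`, would mirror the Case Study 2 configuration. Note that matching the paper's reported 75.69% / 93% accuracies would additionally require the original datasets and rpart's exact pruning behaviour.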