Explaining Model Confidence Using Counterfactuals
Authors: Thao Le, Tim Miller, Ronal Singh, Liz Sonenberg
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We therefore evaluate our explanation in two user studies, using existing assessment methods to test whether counterfactual explanations can improve understanding, trust, and user satisfaction. |
| Researcher Affiliation | Academia | School of Computing and Information Systems, The University of Melbourne thaol4@student.unimelb.edu.au, {tmiller, rr.singh, l.sonenberg}@unimelb.edu.au |
| Pseudocode | No | The paper describes algorithms conceptually but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper refers to a third-party tool's website (Gurobi Optimizer) but does not provide specific access to the authors' own source code for the described methodology. |
| Open Datasets | Yes | The data used for the income prediction task is the Adult dataset published in the UCI Machine Learning Repository (Dua and Graff 2017), which includes 32,561 instances and 14 features. In the second domain, we use the IBM HR Analytics Employee Attrition & Performance dataset published on Kaggle (Pavansubhash 2017), which includes 1,470 instances and 34 features. |
| Dataset Splits | No | The paper describes the datasets used and the selection of features, but it does not specify the training, validation, or test dataset splits (e.g., percentages or counts) used for the machine learning models. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'Gurobi Optimization' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | No | The paper describes the choice of a logistic regression model and the details of the human-subject experiment setup, but it does not specify concrete hyperparameters or system-level training settings for the models (e.g., learning rate, batch size, epochs). |