Axiomatic Aggregations of Abductive Explanations
Authors: Gagan Biradar, Yacine Izza, Elita Lobo, Vignesh Viswanathan, Yair Zick
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We also evaluate them on multiple datasets and show that these explanations are robust to the attacks that fool SHAP and LIME." The work is empirical: "We empirically evaluate our measures, comparing them with well-known feature importance measures: SHAP (Lundberg and Lee 2017) and LIME (Ribeiro, Singh, and Guestrin 2016). Our experimental results demonstrate the robustness of our methods, showing specifically that they are capable of identifying biases in a model that SHAP and LIME cannot identify." |
| Researcher Affiliation | Academia | 1University of Massachusetts, Amherst, USA 2CREATE, National University of Singapore, Singapore {gbiradar,elobo,vviswanathan,yzick}@umass.edu, izza@comp.nus.edu.sg |
| Pseudocode | No | The paper describes the mathematical formulations and properties of the proposed aggregation methods but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Implementation details (footnote 4) of each attack are outlined in the extended version of the paper (Biradar et al. 2023)." Footnote 4 reads: "Code available at https://github.com/elitalobo/aggrxp". |
| Open Datasets | Yes | Compas (Angwin et al. 2016): This dataset contains information about the demographics, criminal records, and Compas risk scores of 6172 individual defendants from Broward County, Florida. and German Credit (Dua and Graff 2017): This dataset contains financial and demographic information on 1000 loan applicants. |
| Dataset Splits | No | "We split a given dataset into train and test datasets in all our experiments. We use the training dataset to train OOD classifiers for the LIME and SHAP attacks and the test dataset to evaluate our methods' robustness." Only train and test splits are mentioned; no validation set or specific split proportions are given. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using LIME and SHAP libraries but does not provide specific version numbers for these or any other software dependencies required to replicate the experiment. |
| Experiment Setup | Yes | "Experimental Setup. We split a given dataset into train and test datasets in all our experiments. We use the training dataset to train OOD classifiers for the LIME and SHAP attacks and the test dataset to evaluate our methods' robustness. To generate explanations using our proposed AXp aggregators, we must first compute the set of all AXps for the adversarial classifier model. We do this using the MARCO algorithm (Liffiton et al. 2016). After generating the complete set of AXps for the adversarial classifier, we compute the feature importance scores using each of our methods: the Holler-Packel index, Deegan-Packel index, and the Responsibility index. We compare our methods with LIME and SHAP, computed using their respective publicly available libraries (Lundberg and Lee 2017; Ribeiro, Singh, and Guestrin 2016)." |
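The aggregation step described in the setup row can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: it assumes the standard definitions of the Holler-Packel index (normalized count of AXps containing a feature), the Deegan-Packel index (normalized sum of 1/|AXp| over AXps containing the feature), and a responsibility-style score (reciprocal of the smallest AXp containing the feature). The function and variable names are hypothetical, and it takes the set of AXps as given (the paper computes them with the MARCO algorithm).

```python
from fractions import Fraction

def holler_packel(axps, n_features):
    """Normalized count of AXps that contain each feature."""
    raw = [sum(1 for s in axps if i in s) for i in range(n_features)]
    total = sum(raw)
    return [Fraction(r, total) if total else Fraction(0) for r in raw]

def deegan_packel(axps, n_features):
    """Each AXp splits one unit of credit equally among its features; normalize."""
    raw = [sum(Fraction(1, len(s)) for s in axps if i in s)
           for i in range(n_features)]
    total = sum(raw)
    return [r / total if total else Fraction(0) for r in raw]

def responsibility(axps, n_features):
    """Reciprocal of the size of the smallest AXp containing each feature."""
    scores = []
    for i in range(n_features):
        sizes = [len(s) for s in axps if i in s]
        scores.append(Fraction(1, min(sizes)) if sizes else Fraction(0))
    return scores

# Toy example: three AXps over four features (hypothetical data).
axps = [{0, 1}, {0, 2}, {0, 1, 3}]
hp = holler_packel(axps, 4)   # feature 0 appears in all three AXps
dp = deegan_packel(axps, 4)
resp = responsibility(axps, 4)
```

Exact `Fraction` arithmetic is used here only to keep the toy example's scores readable; a real implementation over many AXps would use floats or NumPy arrays.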