The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations

Authors: Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, Marcin Detyniecki

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply this test to several datasets and classifiers and show that the risk of generating undesirable counterfactual examples is high. Additionally, we design a second test and show that state of the art post-hoc counterfactual approaches may generate unjustified explanations. The results of the Local Risk Assessment procedure are shown in Table 1. The results of the VE procedure are shown in Table 2.
Researcher Affiliation | Collaboration | Thibault Laugel1, Marie-Jeanne Lesot1, Christophe Marsala1, Xavier Renard2 and Marcin Detyniecki1,2,3; 1Sorbonne Université, CNRS, Laboratoire d'Informatique de Paris 6, LIP6, F-75005 Paris, France; 2AXA, Paris, France; 3Polish Academy of Science, IBS PAN, Warsaw, Poland; thibault.laugel@lip6.fr
Pseudocode | Yes | Algorithm 1: Local risk assessment (a hedged sketch in the spirit of this procedure appears after this table).
Open Source Code | Yes | The obtained results and code to reproduce them are available in an online repository (https://github.com/thibaultlaugel/truce).
Open Datasets | Yes | The datasets considered for these experiments include 2 low-dimensional datasets (half-moons and iris) as well as 2 real datasets: Boston Housing [Harrison and Rubinfeld, 1978] and Propublica Recidivism [Larson et al., 2016] (see the loading sketch after this table).
Dataset Splits | No | The paper only states that 'a train-test split of the data is performed with 70%-30% proportion' and does not specify a validation set or its split percentage.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) were mentioned for the experimental setup.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., PyTorch, TensorFlow, or scikit-learn versions) were provided.
Experiment Setup | Yes | For each considered dataset, a train-test split of the data is performed with a 70%-30% proportion, and a binary classifier is trained. To mitigate the impact the choice of classifier would make, the same classifier is used for every dataset: a random forest (RF) with 200 trees. A Support Vector classifier (SVC) with Gaussian kernel is also trained on one of the datasets (Boston; see below) to make sure the highlighted issue is not a characteristic feature of random forests. A counterfactual example is generated using HCLS [Lash et al., 2017] with budget B = d(x, b0). (A sketch of this setup follows the table.)
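
For reference, a minimal way to load the two low-dimensional datasets with scikit-learn. The sample size and noise level for half-moons are illustrative assumptions; the two real datasets are not bundled here (the Boston Housing loader has been removed from recent scikit-learn releases, and the Propublica Recidivism data is distributed by ProPublica), so they would need to be fetched separately.

```python
# Load the two low-dimensional datasets mentioned in the paper.
# Boston Housing and Propublica Recidivism must be obtained from external sources.
from sklearn.datasets import load_iris, make_moons

# n_samples and noise are illustrative choices, not values reported in the paper.
X_moons, y_moons = make_moons(n_samples=1000, noise=0.2, random_state=0)
X_iris, y_iris = load_iris(return_X_y=True)
```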
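A minimal sketch of the stated protocol on the half-moons data, reusing the variables from the loading snippet above. The 70%-30% split, the 200-tree random forest, and the RBF-kernel SVC come from the paper's description; the random_state values are illustrative assumptions.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# 70%-30% train-test split, as reported; no validation set is described.
X_train, X_test, y_train, y_test = train_test_split(
    X_moons, y_moons, test_size=0.3, random_state=0)

# Same classifier for every dataset: a random forest with 200 trees.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Control experiment: an SVC with Gaussian (RBF) kernel, trained on Boston in the paper.
svc = SVC(kernel="rbf").fit(X_train, y_train)
```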
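The paper's Algorithm 1 (Local Risk Assessment) is not reproduced verbatim here. The following is a hedged sketch of a test in that spirit, assuming that candidate counterfactuals are sampled uniformly in a ball around the instance whose radius is the distance to its closest correctly classified training "enemy", and that an epsilon-chain of same-class points stands in for the paper's continuous-path justification criterion. The function name, the sampling scheme, and the eps parameter are illustrative choices, not the authors' exact ones.

```python
import numpy as np

def local_risk_assessment(x, clf, X_train, y_train, n_samples=1000, eps=0.1, rng=None):
    """Estimate the share of locally generated 'enemy' points that cannot be chained
    back to correctly classified training instances of the same class (a sketch, not
    the paper's exact Algorithm 1)."""
    rng = np.random.default_rng(rng)
    pred_x = clf.predict(x.reshape(1, -1))[0]

    # Training "enemies": correctly classified instances of a different class than x's prediction.
    enemies_train = X_train[(y_train != pred_x) & (clf.predict(X_train) == y_train)]
    radius = np.min(np.linalg.norm(enemies_train - x, axis=1))

    # Sample candidates uniformly in the hyperball B(x, radius) (assumed sampling scheme).
    d = x.shape[0]
    directions = rng.normal(size=(n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    candidates = x + directions * radii

    # Keep generated enemies: candidates the black box labels differently from x.
    gen_enemies = candidates[clf.predict(candidates) != pred_x]
    if len(gen_enemies) == 0:
        return 0.0

    # Connectivity check (assumption: an eps-chain replaces the continuous-path criterion).
    # A generated enemy is "justified" if it lies in the same eps-connected component
    # as at least one training enemy.
    pool = np.vstack([gen_enemies, enemies_train])
    n_gen = len(gen_enemies)
    adj = np.linalg.norm(pool[:, None, :] - pool[None, :, :], axis=-1) <= eps
    justified = np.zeros(len(pool), dtype=bool)
    frontier = list(range(n_gen, len(pool)))  # start the flood fill from training enemies
    justified[frontier] = True
    while frontier:
        i = frontier.pop()
        for j in np.where(adj[i] & ~justified)[0]:
            justified[j] = True
            frontier.append(j)

    # Risk: fraction of generated enemies that no chain links back to the training data.
    return 1.0 - justified[:n_gen].mean()

# Example usage with the random forest trained above.
risk = local_risk_assessment(X_test[0], rf, X_train, y_train)
```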