Fooling SHAP with Stealthily Biased Sampling

Authors: Gabriel Laberge, Ulrich Aïvodji, Satoshi Hara, Mario Marchand, Foutse Khomh

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally (Section 5), we illustrate the impact of the proposed manipulation attack on a synthetic dataset and four popular datasets, namely Adult Income, COMPAS, Marketing, and Communities. We observed that the proposed attack can reduce the importance of a sensitive feature while keeping the data manipulation undetected by the audit.
Researcher Affiliation | Academia | (1) Polytechnique Montréal, Québec; (2) École de technologie supérieure, Québec; (3) Osaka University, Japan; (4) Université Laval, Québec
Pseudocode | Yes | Algorithm 1: Compute non-uniform weights (an illustrative sketch of such a weight computation appears below the table)
Open Source Code | Yes | The source code of all our experiments is available online.
Open Datasets | Yes | We consider four standard datasets from the FAccT literature, namely COMPAS, Adult-Income, Marketing, and Communities.
Dataset Splits | Yes | The datasets were first divided into train/test subsets with ratio 4:5. The models were trained on the training set and evaluated on the test set. All categorical features for COMPAS, Adult, and Marketing were one-hot-encoded, which resulted in a total of 11, 40, and 61 columns for each dataset respectively. A simple 50-step random search was conducted to fine-tune the hyper-parameters with cross-validation on the training set.
Hardware Specification | No | No specific hardware details (like GPU models, CPU types, or memory) used for running experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions the 'SHAP Python library' and that some parts were rewritten in 'C++', but no specific version numbers for these or other software dependencies are provided. (A sketch of how a biased background set could be handed to the SHAP library appears below the table.)
Experiment Setup | Yes | Three models were considered for the two datasets: Multi-Layered Perceptrons (MLP), Random Forests (RF), and eXtreme Gradient Boosted trees (XGB). One model of each type was fitted on each dataset for 5 different train/test split seeds, resulting in 60 models total. A simple 50-step random search was conducted to fine-tune the hyper-parameters with cross-validation on the training set. (An illustrative training-protocol sketch appears below the table.)
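
Below are three illustrative sketches referenced in the table above. None of them is the authors' code; every name, range, and formulation flagged as an assumption is hypothetical.

First, the weight computation behind the paper's Algorithm 1 ("Compute non-uniform weights"). This is a minimal sketch assuming the attack can be cast as a linear program: choose background weights that minimize the weighted mean of the sensitive feature's per-point SHAP contributions, while keeping the weights inside a small L1 ball around the uniform distribution so the manipulation stays hard to detect. The function name biased_background_weights, the inputs phi_sensitive and epsilon, and the L1 detectability constraint are assumptions, not the paper's exact formulation.

    import numpy as np
    from scipy.optimize import linprog

    def biased_background_weights(phi_sensitive, epsilon):
        # phi_sensitive: (N,) per-point contributions of the sensitive
        # feature to its global SHAP importance (assumed linear in the
        # background weights). epsilon: L1 budget around uniform weights.
        n = len(phi_sensitive)
        # Decision variables: weights w (n) and slacks t (n) with
        # t_i >= |w_i - 1/n|.
        c = np.concatenate([phi_sensitive, np.zeros(n)])  # minimize w . phi
        eye = np.eye(n)
        # w_i - t_i <= 1/n and -w_i - t_i <= -1/n encode |w_i - 1/n| <= t_i.
        A_ub = np.block([[eye, -eye], [-eye, -eye]])
        b_ub = np.concatenate([np.full(n, 1.0 / n), np.full(n, -1.0 / n)])
        # Total deviation from the uniform weights is bounded by epsilon.
        A_ub = np.vstack([A_ub, np.concatenate([np.zeros(n), np.ones(n)])])
        b_ub = np.append(b_ub, epsilon)
        # Weights form a probability distribution: sum w = 1, w >= 0.
        A_eq = np.concatenate([np.ones(n), np.zeros(n)])[None, :]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (2 * n))
        return res.x[:n]

With epsilon = 0 this returns the uniform weights; as epsilon grows, mass shifts toward the background points whose contributions to the sensitive feature are smallest.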
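Second, a minimal sketch of how such weights could feed into the SHAP Python library mentioned in the table. shap.KernelExplainer is the library's standard model-agnostic explainer, but the wiring shown here (resampling the background set according to the attack weights before explaining) illustrates the attack's effect rather than reproducing the paper's pipeline, and the function name biased_shap_values is hypothetical.

    import numpy as np
    import shap

    def biased_shap_values(model_predict, X_background, weights, X_audit,
                           n_samples=200, seed=0):
        # Draw a biased background sample according to the attack weights,
        # then run a stock KernelExplainer on it. The audited explainer
        # looks ordinary; the bias lives in how the background was drawn.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(X_background), size=n_samples,
                         replace=True, p=weights)
        explainer = shap.KernelExplainer(model_predict, X_background[idx])
        return explainer.shap_values(X_audit)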
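Third, a sketch of the reported training protocol: one-hot encoding, a train/test split, and a 50-step random search with cross-validation on the training set, repeated over split seeds. The split proportion, the choice of a Random Forest as the example model family, and the hyper-parameter ranges are assumptions for illustration.

    import pandas as pd
    from scipy.stats import randint
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    def fit_one_model(X, y, seed):
        X = pd.get_dummies(X)  # one-hot encode categorical features
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=0.8, random_state=seed)  # proportion assumed
        search = RandomizedSearchCV(
            RandomForestClassifier(random_state=seed),
            param_distributions={"n_estimators": randint(50, 500),
                                 "max_depth": randint(2, 20)},  # ranges assumed
            n_iter=50, cv=5, random_state=seed)  # 50-step random search
        search.fit(X_tr, y_tr)
        return search.best_estimator_, search.best_estimator_.score(X_te, y_te)

Repeating this over 5 split seeds and each model family yields the kind of per-seed model grid the paper describes.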