Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance

Authors: Jonathan Crabbé, Mihaela van der Schaar

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By empirically measuring our metrics for explanations of models associated with various modalities and symmetry groups, we derive a set of 5 guidelines to allow users and developers of interpretability methods to produce robust explanations. (An illustrative invariance-metric sketch follows the table.) |
| Researcher Affiliation | Academia | Jonathan Crabbé, DAMTP, University of Cambridge, jc2133@cam.ac.uk; Mihaela van der Schaar, DAMTP, University of Cambridge, mv472@cam.ac.uk |
| Pseudocode | No | The paper does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The code and instructions to replicate all the results reported below are available in the public repositories https://github.com/JonathanCrabbe/RobustXAI and https://github.com/vanderschaarlab/RobustXAI. |
| Open Datasets | Yes | The datasets used in our experiment are presented in Table 2. We explore various modalities and symmetry groups throughout the section, as described in Table 3. For each dataset, we fit and study a classifier from the literature designed to be invariant with respect to the underlying symmetry group. Datasets mentioned include: Electrocardiograms [50, 51], Mutagenicity [53–55], ModelNet40 [57–59], IMDb [60], Fashion-MNIST [61], CIFAR100 [62], STL10 [65]. |
| Dataset Splits | Yes | We perform a random train-validation-test split of this dataset (90%-5%-5%) and fit a 2-layer bag-of-words MLP on the training set for 20 epochs with Adam and a cosine annealing learning rate. The test set is used as a validation set in some cases, as model generalization is never used as an evaluation criterion. (See the split-and-training sketch after the table.) |
| Hardware Specification | Yes | Almost all the empirical evaluations were run on a single machine equipped with a 64-core AMD Ryzen Threadripper PRO 3995WX CPU and an NVIDIA RTX A4000 GPU. The only exceptions are the CIFAR100 and STL10 experiments, for which we used a Microsoft Azure virtual machine equipped with a single Tesla V100 GPU. |
| Software Dependencies | Yes | All the machines run on Python 3.10 [78] and PyTorch 1.13.1 [79]. |
| Experiment Setup | Yes | The CNNs are trained to minimize the cross-entropy loss for 200 epochs with early stopping (patience 10), a learning rate of 10^-3 and a weight decay of 10^-5. The GNN is trained to minimize the negative log-likelihood for 200 epochs with early stopping (patience 20), a learning rate of 10^-3 and a weight decay of 10^-5. The Deep Set is trained to minimize the cross-entropy loss for 1,000 epochs with early stopping (patience 20), a learning rate of 10^-3, a weight decay of 10^-7 and a multi-step learning rate scheduler with γ = 0.1. (See the early-stopping sketch after the table.) |
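
The paper's metric definitions are not quoted in the table above. As a rough illustration of the kind of explanation-invariance score the Research Type row alludes to, the sketch below compares explanations of symmetry-transformed inputs against explanations of the originals. The names `explain_fn` and `group_actions` are hypothetical placeholders, not the authors' API.

```python
import torch
import torch.nn.functional as F

def explanation_invariance(explain_fn, group_actions, inputs):
    """Illustrative invariance score (not the paper's exact metric):
    cosine similarity between the explanation of a transformed input
    e(g.x) and the explanation of the original e(x), averaged over the
    sampled group elements. `explain_fn` maps a batch of inputs to a
    batch of attribution tensors; `group_actions` is a list of
    callables, each applying one symmetry g to a batch (placeholders).
    """
    base = explain_fn(inputs).flatten(start_dim=1)  # e(x), one row per sample
    scores = []
    for g in group_actions:
        shifted = explain_fn(g(inputs)).flatten(start_dim=1)  # e(g.x)
        scores.append(F.cosine_similarity(base, shifted, dim=1))
    return torch.stack(scores).mean()  # 1.0 means perfectly invariant
```

A score near 1 indicates the explanation is unchanged by the symmetry; for an equivariance score one would instead compare e(g.x) against the group action applied to e(x).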
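
For the Dataset Splits row, here is a minimal sketch of the quoted 90%-5%-5% random split and the 20-epoch Adam + cosine-annealing recipe. The synthetic bag-of-words features, hidden width, batch size, and seed are placeholders, not values from the paper.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, random_split, DataLoader

# Placeholder bag-of-words data; vocabulary size and sample count are assumed.
vocab_size, n_classes = 1000, 2
X = torch.rand(5000, vocab_size)
y = torch.randint(0, n_classes, (5000,))
dataset = TensorDataset(X, y)

# Random 90%-5%-5% train-validation-test split, as quoted.
n = len(dataset)
n_val, n_test = int(0.05 * n), int(0.05 * n)
train_set, val_set, test_set = random_split(
    dataset, [n - n_val - n_test, n_val, n_test]
)

# 2-layer MLP; the hidden width of 256 is an assumption.
model = nn.Sequential(
    nn.Linear(vocab_size, 256), nn.ReLU(), nn.Linear(256, n_classes)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)

# 20 epochs of Adam with a cosine-annealed learning rate, as quoted.
for epoch in range(20):
    for xb, yb in DataLoader(train_set, batch_size=64, shuffle=True):
        optimizer.zero_grad()
        nn.functional.cross_entropy(model(xb), yb).backward()
        optimizer.step()
    scheduler.step()
```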
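
Finally, for the Experiment Setup row, the skeleton below sketches the quoted early-stopping recipe: Adam with the stated learning rate and weight decay, a patience counter on the validation loss, and an optional multi-step schedule with γ = 0.1 for the Deep Set. `train_one_epoch` and `validate` are caller-supplied placeholders, not the authors' code.

```python
import torch

def fit_with_early_stopping(model, train_one_epoch, validate,
                            max_epochs=200, patience=10,
                            lr=1e-3, weight_decay=1e-5,
                            milestones=None, gamma=0.1):
    """Early-stopping skeleton for the quoted recipes: Adam with the
    given learning rate and weight decay, an optional multi-step LR
    schedule, and training halted once the validation loss has not
    improved for `patience` consecutive epochs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 weight_decay=weight_decay)
    scheduler = None
    if milestones is not None:
        scheduler = torch.optim.lr_scheduler.MultiStepLR(
            optimizer, milestones=milestones, gamma=gamma)
    best_loss, epochs_without_improvement = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model, optimizer)
        val_loss = validate(model)
        if val_loss < best_loss:
            best_loss, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping: patience exhausted
        if scheduler is not None:
            scheduler.step()
    return best_loss
```

The CNN setting corresponds to the defaults above (patience 10, weight decay 10^-5); the GNN and Deep Set settings swap in patience 20 and, for the Deep Set, max_epochs=1000, weight_decay=1e-7, and a milestone schedule (the milestone epochs themselves are not given in the quoted text).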