Rethinking Robustness of Model Attributions

Authors: Sandesh Kamath, Sankalp Mittal, Amit Deshpande, Vineeth N Balasubramanian

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Towards the role of model training in attributional robustness, we empirically observe that adversarially trained models have more robust attributions on smaller datasets; however, this advantage disappears on larger datasets. Our comprehensive empirical results on benchmark datasets and models used in existing work clearly support our aforementioned observations, as well as the need to rethink the evaluation of the robustness of model attributions using locality and diversity.
Researcher Affiliation | Collaboration | Sandesh Kamath (1), Sankalp Mittal (1), Amit Deshpande (2), Vineeth N Balasubramanian (1); (1) Indian Institute of Technology, Hyderabad; (2) Microsoft Research, Bengaluru
Pseudocode | No | The paper describes a greedy algorithm verbally but does not provide it in a structured pseudocode or algorithm block format; an illustrative sketch of such a greedy selection appears after the table.
Open Source Code | Yes | Code is made available at https://github.com/ksandeshk/LENS.
Open Datasets | Yes | Sample images from the Flower dataset with Integrated Gradients (IG) before and after an attributional attack. We use k = 1000 and three attributional attack variants proposed by Ghorbani, Abid, and Zou (2019)... for Simple Gradients (SG) (left) and Integrated Gradients (IG) (right) of a SqueezeNet model on ImageNet. All perturbations have ℓ∞ norm bounded by δ = 0.3 for MNIST, δ = 0.1 for Fashion MNIST, and δ = 8/255 for the GTSRB and Flower datasets.
Dataset Splits | No | The paper mentions using MNIST, Fashion MNIST, GTSRB, Flower, and ImageNet and specifies parameters for robustness evaluations (e.g., k values for top-k pixels, δ for perturbations), but does not provide the specific train/validation/test splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or memory specifications used for the experiments. It only mentions using SqueezeNet and ResNet models.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., 'PyTorch 1.x' or 'Python 3.x').
Experiment Setup | Yes | A detailed description of our experimental setup for these results is available in Appendix C. All perturbations have ℓ∞ norm bounded by δ = 0.3 for MNIST, δ = 0.1 for Fashion MNIST, and δ = 8/255 for the GTSRB and Flower datasets. The values of t used to construct the top-t attacks of Ghorbani, Abid, and Zou (2019) are t = 200 on MNIST, t = 100 on Fashion MNIST and GTSRB, and t = 1000 on Flower. In the robustness evaluations for a fixed k, we use k = 100 on MNIST, Fashion MNIST, and GTSRB, and k = 1000 on Flower. A sketch of this top-k evaluation also follows the table.
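
As noted in the Pseudocode row, the paper describes its greedy algorithm only verbally. The following is a minimal illustrative sketch of what a greedy, diversity-aware top-k pixel selection could look like; the function name `greedy_diverse_topk` and the minimum-Chebyshev-distance diversity rule are assumptions chosen for illustration, not the authors' algorithm.

```python
import numpy as np

def greedy_diverse_topk(attribution, k=100, min_dist=2):
    """Greedily pick k high-attribution pixels that are spatially spread out.

    Hypothetical reconstruction: the diversity criterion (a minimum Chebyshev
    distance between chosen pixels) is an assumption, not the paper's rule.
    """
    h, w = attribution.shape
    # Rank all pixels by attribution value, highest first.
    order = np.argsort(attribution, axis=None)[::-1]
    chosen = []
    for flat_idx in order:
        r, c = divmod(int(flat_idx), w)
        # Diversity constraint: skip pixels too close to an already-chosen one.
        if all(max(abs(r - cr), abs(c - cc)) >= min_dist for cr, cc in chosen):
            chosen.append((r, c))
            if len(chosen) == k:
                break
    return chosen
```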
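The Experiment Setup row fixes k per dataset. Robustness numbers of this kind are typically computed, following Ghorbani, Abid, and Zou (2019), as the overlap between the top-k attribution pixels before and after an ℓ∞-bounded perturbation. Below is a minimal sketch assuming that standard top-k intersection metric; the paper's exact aggregation across images may differ.

```python
import numpy as np

def topk_intersection(attr_clean, attr_perturbed, k=100):
    """Fraction of the top-k attribution pixels preserved after an attack."""
    top_clean = set(np.argsort(attr_clean, axis=None)[-k:].tolist())
    top_pert = set(np.argsort(attr_perturbed, axis=None)[-k:].tolist())
    return len(top_clean & top_pert) / k

# Toy usage with the MNIST setting (k = 100). Random maps stand in for real
# SG/IG attributions of a model on clean vs. perturbed inputs; the attack
# perturbation itself would be clipped so that ||x' - x||_inf <= 0.3 on MNIST.
rng = np.random.default_rng(0)
print(topk_intersection(rng.random((28, 28)), rng.random((28, 28)), k=100))
```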