Rethinking Robustness of Model Attributions

Authors: Sandesh Kamath, Sankalp Mittal, Amit Deshpande, Vineeth N Balasubramanian

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Towards the role of model training in attributional robustness, we empirically observe that adversarially trained models have more robust attributions on smaller datasets; however, this advantage disappears on larger datasets. Our comprehensive empirical results on benchmark datasets and models used in existing work clearly support our aforementioned observations, as well as the need to rethink the evaluation of the robustness of model attributions using locality and diversity.
Researcher Affiliation | Collaboration | Sandesh Kamath (1), Sankalp Mittal (1), Amit Deshpande (2), Vineeth N Balasubramanian (1); (1) Indian Institute of Technology, Hyderabad; (2) Microsoft Research, Bengaluru
Pseudocode | No | The paper describes a greedy algorithm verbally but does not provide it in a structured pseudocode or algorithm block format; an illustrative sketch of such a greedy selection appears after the table.
Open Source Code | Yes | Code is made available at https://github.com/ksandeshk/LENS.
Open Datasets | Yes | Sample images from the Flower dataset with Integrated Gradients (IG) before and after an attributional attack. We use k = 1000 and three attributional attack variants proposed by Ghorbani, Abid, and Zou (2019)... for Simple Gradients (SG) (left) and Integrated Gradients (IG) (right) of a SqueezeNet model on ImageNet. All perturbations have ℓ∞ norm bounded by δ = 0.3 for MNIST, δ = 0.1 for Fashion MNIST, and δ = 8/255 for the GTSRB and Flower datasets.
Dataset Splits | No | The paper mentions using MNIST, Fashion MNIST, GTSRB, Flower, and ImageNet and specifies parameters for robustness evaluations (e.g., k values for top-k pixels, δ for perturbations), but does not provide the specific train/validation/test splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or memory specifications used for the experiments. It only mentions using SqueezeNet and ResNet models.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., 'PyTorch 1.x' or 'Python 3.x').
Experiment Setup | Yes | A detailed description of our experimental setup for these results is available in Appendix C. All perturbations have ℓ∞ norm bounded by δ = 0.3 for MNIST, δ = 0.1 for Fashion MNIST, and δ = 8/255 for the GTSRB and Flower datasets. The values of t used to construct the top-t attacks of Ghorbani, Abid, and Zou (2019) are t = 200 on MNIST, t = 100 on Fashion MNIST and GTSRB, and t = 1000 on Flower. In the robustness evaluations for a fixed k, we use k = 100 on MNIST, Fashion MNIST, and GTSRB, and k = 1000 on Flower. A sketch of this top-k evaluation also follows the table.
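
As noted in the Pseudocode row, the paper describes its greedy algorithm only verbally. The following is a minimal illustrative sketch of what a greedy, diversity-aware top-k pixel selection could look like; the function name `greedy_diverse_topk` and the minimum-Chebyshev-distance diversity rule are assumptions chosen for illustration, not the authors' algorithm.

```python
import numpy as np

def greedy_diverse_topk(attribution, k=100, min_dist=2):
    """Greedily pick k high-attribution pixels that are spatially spread out.

    Hypothetical reconstruction: the diversity criterion (a minimum Chebyshev
    distance between chosen pixels) is an assumption, not the paper's rule.
    """
    h, w = attribution.shape
    # Rank all pixels by attribution value, highest first.
    order = np.argsort(attribution, axis=None)[::-1]
    chosen = []
    for flat_idx in order:
        r, c = divmod(int(flat_idx), w)
        # Diversity constraint: skip pixels too close to an already-chosen one.
        if all(max(abs(r - cr), abs(c - cc)) >= min_dist for cr, cc in chosen):
            chosen.append((r, c))
            if len(chosen) == k:
                break
    return chosen
```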
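The Experiment Setup row fixes k per dataset. Robustness numbers of this kind are typically computed, following Ghorbani, Abid, and Zou (2019), as the overlap between the top-k attribution pixels before and after an ℓ∞-bounded perturbation. Below is a minimal sketch assuming that standard top-k intersection metric; the paper's exact aggregation across images may differ.

```python
import numpy as np

def topk_intersection(attr_clean, attr_perturbed, k=100):
    """Fraction of the top-k attribution pixels preserved after an attack."""
    top_clean = set(np.argsort(attr_clean, axis=None)[-k:].tolist())
    top_pert = set(np.argsort(attr_perturbed, axis=None)[-k:].tolist())
    return len(top_clean & top_pert) / k

# Toy usage with the MNIST setting (k = 100). Random maps stand in for real
# SG/IG attributions of a model on clean vs. perturbed inputs; the attack
# perturbation itself would be clipped so that ||x' - x||_inf <= 0.3 on MNIST.
rng = np.random.default_rng(0)
print(topk_intersection(rng.random((28, 28)), rng.random((28, 28)), k=100))
```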