reproducibilityindex.ai

Robust Models Are More Interpretable Because Attributions Look Normal

Authors: Zifan Wang, Matt Fredrikson, Anupam Datta

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	With both analytical (Sec. 3) and empirical (Sec. 5) results, we show that the gradient of the model with respect to its input... We empirically demonstrate that one such type of boundary attribution, called Boundary-based Integrated Gradients (BIG), produces explanations that are more accurate than prior attribution methods (relative to ground-truth bounding box information), while mitigating the problem of baseline sensitivity that is known to impact applications of Integrated Gradients (Sundararajan et al., 2017) (Section 6). The paper also has a section titled "5. Evaluation".
Researcher Affiliation	Academia	Zifan Wang¹, Matt Fredrikson¹, Anupam Datta¹. ¹Carnegie Mellon University, Pittsburgh, PA 15213, USA. Correspondence to: Zifan Wang <zifan@cmu.edu>.
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	Yes	Code can be found at https://github.com/zifanw/boundary.
Open Datasets	Yes	We conduct experiments over two data distributions, ImageNet (Russakovsky et al., 2015) and CIFAR-10 (Krizhevsky et al.).
Dataset Splits	No	The paper uses pre-trained models and evaluates on specific subsets of correctly-classified images from ImageNet (1500) and CIFAR-10 (5000), but does not provide details on training, validation, or test splits for reproducing model training from scratch.
Hardware Specification	Yes	All computations are done using a GPU accelerator Titan RTX with a memory size of 24 GB.
Software Dependencies	Yes	All attributions are implemented with Captum (Kokhlikyan et al., 2020) and visualized with Trulens (Leino et al., 2021a). The implementations of PGDs and CW are based on Foolbox (Rauber et al., 2020; 2017) and the implementation of AutoPGD is based on the authors' public repository (we only use apgd-ce and apgd-dlr losses for efficiency reasons).
Experiment Setup	Yes	Implementation details of the boundary search (by ensembling the results of PGD, CW and AutoPGD) and the hyperparameters used in our experiments are included in Appendix B.2. Hyper-parameters for each attack can be found in Table 7. The details of our implementation are discussed in Section 5, where we show that this yields good results in practice.
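
The rows above mention Boundary-based Integrated Gradients (BIG) and a boundary search built from adversarial attacks. As a rough illustration only, the sketch below computes Integrated Gradients on a toy logistic model, using a point on the decision boundary as the baseline. It is not the authors' implementation and does not use Captum or Foolbox. The names model, grad, nearest_boundary_point, and integrated_gradients are hypothetical, and the closed-form projection onto the boundary is an assumption that only holds for a linear model; it stands in for the attack-based search (PGD, CW, AutoPGD) described in the table.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def model(x, w, b):
    """Toy logistic model: sigmoid(w . x + b). Stand-in for a real network."""
    return 1.0 / (1.0 + math.exp(-(dot(w, x) + b)))


def grad(x, w, b):
    """Analytic gradient of the toy model's output with respect to the input x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def nearest_boundary_point(x, w, b):
    """Closest point to x on the hyperplane w . x + b = 0.

    Assumption: this closed form is valid only for a linear decision boundary.
    For a deep network one would instead search for a nearby adversarial
    example with attacks.
    """
    scale = (dot(w, x) + b) / dot(w, w)
    return [xi - scale * wi for xi, wi in zip(x, w)]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Integrated Gradients along the straight line from baseline to x,
    approximated with a midpoint Riemann sum over `steps` points."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad(point, w, b))]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


if __name__ == "__main__":
    w, b = [2.0, -1.0, 0.5], 0.1   # hypothetical model parameters
    x = [1.0, 0.5, -0.2]           # hypothetical input
    baseline = nearest_boundary_point(x, w, b)
    attributions = integrated_gradients(x, baseline, w, b)
    print("baseline:", baseline)
    print("attributions:", attributions)
```

Because the baseline lies on the decision boundary, the model outputs 0.5 there, so the attributions should sum to approximately model(x) minus 0.5 (the completeness property of Integrated Gradients). Deriving the baseline from the input in this way, rather than fixing it to something like an all-zero image, is the sense in which a boundary baseline is meant to reduce baseline sensitivity.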