NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights
Authors: Kirill Bykov, Anna Hedström, Shinichi Nakajima, Marina M.-C. Höhne (pp. 6132–6140)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate NoiseGrad and its fusion with SmoothGrad, FusionGrad, qualitatively and quantitatively with several evaluation criteria, and show that our novel approach significantly outperforms the baseline methods. [...] Experiments On Local Explanations In this section, we explain datasets and evaluation metrics used for evaluating our proposed methods for local attribution quality. [...] Quantitative evaluation We start by examining the performance of the methods considering the four aforementioned attribution quality criteria applied to the absolute values of their respective explanations. The results are summarized in Table 1, where the methods (Baseline, SG, NG, FG) are stated in the first column and the respective values for localization, faithfulness, robustness, and sparseness in columns 2-5. |
| Researcher Affiliation | Academia | Kirill Bykov*,1,2, Anna Hedström*,1,2, Shinichi Nakajima1,3, Marina M.-C. Höhne1,2 — 1 ML Group, TU Berlin, Germany; 2 Understandable Machine Intelligence Lab; 3 RIKEN AIP, Tokyo, Japan |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: 'The Quantus library was employed for XAI evaluation1 (Hedström et al. 2022). 1Code can be found at https://github.com/understandable-machine-intelligence-lab/quantus' and for global explanations: 'For generating global explanations following library was used https://github.com/Mayukhdeb/torch-dreams'. These links refer to third-party tools used in the evaluation, not the source code for the NoiseGrad or FusionGrad methodology itself. |
| Open Datasets | Yes | For this purpose, we construct a semi-natural dataset CMNIST (customized-MNIST), where each MNIST digit (Le Cun, Cortes, and Burges 2010) is displayed on a randomly selected CIFAR background (Krizhevsky, Hinton et al. 2009). [...] Moreover, to understand the real impact of SOTA, we use the PASCAL VOC 2012 object recognition dataset (Everingham et al. 2010) and ILSVRC-15 dataset (Russakovsky et al. 2015) for evaluation, where object segmentation masks in the forms of bounding boxes are available. |
| Dataset Splits | No | The paper states 'Further details on training and test splits, preprocessing steps and other relevant dataset statistics can be found in the Appendix.' but does not explicitly provide details on validation splits or their sizes within the main text. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'Quantus library' and 'torch-dreams' library but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper describes a heuristic for setting noise levels for SG, NG, and FG, but does not provide concrete hyperparameter values for model training such as learning rate, batch size, or number of epochs in the main text. It defers some details to the Appendix: 'For more details on the model architectures, optimization configurations, and training results, we refer to the Appendix.' |
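Since the paper provides no pseudocode, the core idea reviewed above (NoiseGrad perturbs model *weights* with multiplicative Gaussian noise and averages the resulting attributions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear model, the `saliency` helper, and all parameter values are illustrative assumptions.

```python
import numpy as np

def saliency(w, x):
    """Input-gradient attribution for a linear scorer f(x) = w @ x.
    For this toy model the gradient w.r.t. the input is just w."""
    return w

def noisegrad(w, x, n_samples=50, sigma=0.2, seed=0):
    """NoiseGrad-style averaging (sketch): draw multiplicative Gaussian
    noise around 1.0, apply it to the weights, and average the
    attributions computed under each perturbed model."""
    rng = np.random.default_rng(seed)
    maps = []
    for _ in range(n_samples):
        noise = rng.normal(loc=1.0, scale=sigma, size=w.shape)
        maps.append(saliency(w * noise, x))
    return np.mean(maps, axis=0)
```

FusionGrad would additionally perturb the *input* inside the loop (SmoothGrad-style), combining both noise sources before averaging; the sketch above covers only the weight-noise half.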