Learning how to explain neural networks: PatternNet and PatternAttribution
Authors: Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyze the performance of existing explanation approaches in the controlled setting of a linear model (Sections 2 and 3). We propose two novel explanation methods, PatternNet and PatternAttribution, that alleviate shortcomings of current approaches, as discovered during our analysis, and improve explanations in real-world deep neural networks visually and quantitatively (Sections 4 and 5). To evaluate the quality of the explanations, we focus on the task of image classification. We used Theano (Bergstra et al., 2010) and Lasagne (Dieleman et al., 2015) for our implementation. We restrict the analysis to the well-known ImageNet dataset (Russakovsky et al., 2015) using the pre-trained VGG-16 model (Simonyan & Zisserman, 2015). |
| Researcher Affiliation | Collaboration | Pieter-Jan Kindermans, Google Brain, pikinder@google.com. Kristof T. Schütt & Maximilian Alber, TU Berlin, {kristof.schuett,maximilian.alber}@tu-berlin.de. Klaus-Robert Müller, TU Berlin, klaus-robert.mueller@tu-berlin.de. Dumitru Erhan & Been Kim, Google Brain, {dumitru,beenkim}@google.com. Sven Dähne, TU Berlin, sven.daehne@tu-berlin.de. Part of this work was done at TU Berlin; part of the work was part of the Google Brain Residency program. KRM is also with Korea University and the Max Planck Institute for Informatics, Saarbrücken, Germany. Sven Dähne is now at Amazon. |
| Pseudocode | Yes | A ALGORITHMS In this section we will give an overview of the visualization algorithms to clarify their actual implementation for ReLU networks. This shows the similarities and the differences between all approaches. For all visualization approaches, the back-projection through a max-pooling layer is only through the path that was active in the forward pass. A.1 FUNCTION VISUALISATION A.1.1 GRADIENT WITH RESPECT TO THE INPUT ... A.2 SIGNAL VISUALIZATION A.2.1 DECONVNET ... A.3 ATTRIBUTION VISUALIZATION A.3.1 DEEP-TAYLOR DECOMPOSITION |
| Open Source Code | No | The paper states that, for the comparison to prediction-difference analysis, the authors used "the open-source code provided by the authors" (referring to Zintgraf et al. (2017)), but it does not provide an explicit statement or link for the source code of their own implementation. |
| Open Datasets | Yes | We restrict the analysis to the well-known ImageNet dataset (Russakovsky et al., 2015) using the pre-trained VGG-16 model (Simonyan & Zisserman, 2015). |
| Dataset Splits | Yes | The signal estimators are trained on the first half of the training dataset. The vector v, used to measure the quality of the signal estimator ρ(x) in Eq. (1), is optimized on the second half of the training dataset. All the results presented here were obtained using the official validation set of 50,000 samples. The validation set was used neither for training the signal estimators nor for training the vector v to measure the quality. Consequently, our results are obtained on previously unseen data. |
| Hardware Specification | Yes | This was implemented on an NVIDIA Tesla K40 and took about 24 hours per optimized signal estimator. |
| Software Dependencies | No | We used Theano (Bergstra et al., 2010) and Lasagne (Dieleman et al., 2015) for our implementation. We optimize the equivalent least-squares problem using stochastic mini-batch gradient descent with Adam (Kingma & Ba, 2015) until convergence. While software packages are mentioned, specific version numbers for Theano and Lasagne are not provided. |
| Experiment Setup | Yes | Images were rescaled and cropped to 224×224 pixels. The signal estimators are trained on the first half of the training dataset. We optimize the equivalent least-squares problem using stochastic mini-batch gradient descent with Adam (Kingma & Ba, 2015) until convergence. |
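The "equivalent least-squares problem" quoted above refers to fitting, for each neuron, a linear signal pattern a such that a·y best reconstructs the input x from the neuron output y = wᵀx. A minimal NumPy sketch of this linear pattern estimator (toy data and variable names are assumptions; the paper's actual implementation uses Theano/Lasagne and mini-batch Adam rather than the closed form shown here):

```python
import numpy as np

rng = np.random.default_rng(0)

d = 5                            # input dimensionality of one neuron (toy size)
w = rng.normal(size=d)           # hypothetical weight vector of the neuron
X = rng.normal(size=(1000, d))   # toy input data, rows are samples
y = X @ w                        # neuron output for each sample

# Linear pattern estimator: a = cov(x, y) / var(y).
# Per input dimension i, a_i is the least-squares solution of
# minimizing E[(x_i - a_i * y)^2] over the (centered) data.
y_c = y - y.mean()
X_c = X - X.mean(axis=0)
a = (X_c * y_c[:, None]).mean(axis=0) / y_c.var()
```

For a full network, this fit is repeated per neuron on half of the training data, which is why the paper reports roughly 24 hours per optimized signal estimator on a K40.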