Fooling Neural Network Interpretations via Adversarial Model Manipulation

Authors: Juyeon Heo, Sunghwan Joo, Taesup Moon

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results are validated by both visually showing the fooled explanations and reporting quantitative metrics that measure the deviations from the original explanations.
Researcher Affiliation | Academia | Juyeon Heo (1), Sunghwan Joo (1), and Taesup Moon (1,2); (1) Department of Electrical and Computer Engineering, (2) Department of Artificial Intelligence, Sungkyunkwan University, Suwon, Korea, 16419; heojuyeon12@gmail.com, {shjoo840, tsmoon}@skku.edu
Pseudocode | No | The paper describes methods and formulas but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/rmrisforbidden/Fooling_Neural_Network-Interpretations.
Open Datasets | Yes | For all our fooling methods, we used the ImageNet training set [30] as our D and took three pretrained models, VGG19 [31], ResNet50 [32], and DenseNet121 [33], for carrying out the foolings.
Dataset Splits | Yes | We show the fooled explanation generalizes to the entire validation set, indicating that the interpretations are truly fooled, not just for some specific inputs, in contrast to [11, 13, 14]. [...] The accuracy drops are around only 2%/1% for Top-1/Top-5 accuracy, respectively. Table 3: Accuracy of the pre-trained models and the manipulated models on the entire ImageNet validation set. [...] Figure 4(a) shows the average AOPC curves on 10K validation images for the original and manipulated DenseNet121 (Top-k fooled with Grad-CAM) models.
Hardware Specification | No | The paper does not specify any hardware used for the experiments (e.g., GPU models, CPU types).
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | For all our fooling methods, we used the ImageNet training set [30] as our D and took three pretrained models, VGG19 [31], ResNet50 [32], and DenseNet121 [33], for carrying out the foolings. For the Active fooling, we additionally constructed Dfool with images that contain two classes, {c1 = African elephant, c2 = Firetruck}, by constructing each image by concatenating two images from each class in the 2x2 block. [...] We empirically defined Rf as [0, 0.2], [0, 0.3], [0.1, 1], and [0.5, 2] for Location, Top-k, Center-mass, and Active fooling, respectively.
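The Active-fooling setup quoted above builds each Dfool image by tiling crops of the two target classes into a 2x2 block. The following is a minimal NumPy sketch of one plausible arrangement; the function name, the (H, W, C) layout, and the exact placement of the two classes within the grid are assumptions, not the authors' code.

```python
import numpy as np

def make_two_class_grid(img_a, img_b):
    """Tile two same-sized (H, W, C) crops into a 2x2 composite image.

    Sketch of the Dfool construction: each synthetic input contains both
    classes (e.g. an African elephant crop and a firetruck crop). The
    alternating layout here is illustrative; the paper's layout may differ.
    """
    top = np.concatenate([img_a, img_b], axis=1)     # a | b
    bottom = np.concatenate([img_b, img_a], axis=1)  # b | a
    return np.concatenate([top, bottom], axis=0)     # stack rows

# Example: two 112x112 crops yield one 224x224 composite input.
a = np.ones((112, 112, 3))
b = np.zeros((112, 112, 3))
grid = make_two_class_grid(a, b)
assert grid.shape == (224, 224, 3)
```

Feeding such composites to the model while penalizing the explanation for highlighting the wrong class is what drives the Active fooling objective.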
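The Dataset Splits row cites AOPC (Area Over the Perturbation Curve) curves on 10K validation images. AOPC averages the model-score drop after successively perturbing the regions an explanation ranks most relevant. A hedged sketch of that average, assuming per-image score curves have already been collected (the function name and array interface are illustrative, not from the paper):

```python
import numpy as np

def aopc(score_curves):
    """Compute the average AOPC curve from per-image score trajectories.

    score_curves: shape (n_images, n_steps + 1); column j is the class
    score after perturbing the j most relevant regions (column 0 is the
    unperturbed score). AOPC at step K is the mean of f(x_0) - f(x_j)
    over j = 0..K, averaged over images.
    """
    drops = score_curves[:, :1] - score_curves          # score drop per step
    steps = np.arange(1, drops.shape[1] + 1)
    cum = np.cumsum(drops, axis=1) / steps              # running mean of drops
    return cum.mean(axis=0)                             # average over images
```

A manipulated model whose explanations are truly fooled should show a visibly flatter AOPC curve than the original, which is the comparison Figure 4(a) reports.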