“Why Not Other Classes?”: Towards Class-Contrastive Back-Propagation Explanations
Authors: Yipei Wang, Xiaoqian Wang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is empirical: 'All experiments are implemented through Pytorch on Intel Core i9-9960X CPU @ 3.10GHz with Quadro RTX 6000 GPUs.' Quantitative results are reported as bar plots, e.g.: 'For each model, gray-ish, red-ish and blue-ish bars represent results of 1, 2, 10 iterations respectively. For a given step, the wide shallow bar represents the change in y_t, the thin median bar represents the change in p_t, and the thin deep bar represents the change in accuracy.' |
| Researcher Affiliation | Academia | Yipei Wang, Xiaoqian Wang; Elmore Family School of Electrical & Computer Engineering, Purdue University, West Lafayette, IN 47906; {wang4865,joywang}@purdue.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | We propose to compare the recourse application of the weighted contrastive gradient (Weighted), original gradient (Original), mean contrastive gradient (Mean), and max contrastive gradient (Max) methods by performing the input modification based on the gradient sign perturbation over the ILSVRC2012 (Deng et al., 2009) validation set, which contains 50,000 images from 1,000 classes. Experiments are also carried out over 5 fine-grained datasets: CUB-200 (Wah et al., 2011) with 200 classes of birds, Fine-Grained Visual Classification of Aircraft (FGVC) (Blaschko et al., 2012) with 100 classes of aircraft, Food-101 (Bossard et al., 2014) with 101 classes of food, Flower-102 (Nilsback and Zisserman, 2008) with 102 classes of flowers, and Stanford Cars (Krause et al., 2013) with 196 classes of cars. A hedged data-loading sketch follows the table. |
| Dataset Splits | No | For ILSVRC2012, the paper mentions using the 'validation set' but does not specify a training split. For the other fine-grained datasets, it lists them but does not provide details on training, validation, or test splits. The paper's self-checklist also states: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [No]'. |
| Hardware Specification | Yes | All experiments are implemented through Pytorch on Intel Core i9-9960X CPU @ 3.10GHz with Quadro RTX 6000 GPUs. |
| Software Dependencies | No | The paper states only that 'All experiments are implemented through Pytorch on Intel Core i9-9960X CPU @ 3.10GHz with Quadro RTX 6000 GPUs'; no version numbers for PyTorch or any other software dependencies are given. |
| Experiment Setup | Yes | The gradient sign perturbations are implemented following projected gradient descent (Madry et al., 2017): x_{n+1} ← x_n + α · sign(∇ϕ_t(x_n)), followed by x_{n+1} ← clamp(x_{n+1}, min(x − ε, 0), max(x + ε, 1)), where n ∈ {1, 2, 10} is the number of iterations, ε = 10⁻³ is the perturbation limit, and α = ε/n is the step size. VGG-16 is applied as the explained classifier, and 14×14 heatmaps are generated at the output of the CNN layers but before the last max-pooling layer for higher resolution; they are then upsampled to the 224×224 input space by bilinear upsampling. A hedged PyTorch sketch follows the table. |
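The Open Datasets row lists the evaluation data. The following is a minimal loading sketch, assuming torchvision dataset wrappers (torchvision >= 0.13), local dataset paths, and a standard 224×224 evaluation transform; none of this comes from the authors (no code was released), and all paths and names are illustrative.

```python
# Hedged sketch: loading the evaluation datasets named in the report.
# Paths, transforms, and the torchvision wrappers are assumptions,
# not the authors' pipeline.
import torchvision.datasets as datasets
import torchvision.transforms as T

eval_transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),  # images in [0, 1], consistent with the clamp bounds quoted above
])

# ILSVRC2012 validation split: 50,000 images, 1,000 classes (assumed local copy).
imagenet_val = datasets.ImageNet(root="/path/to/ILSVRC2012", split="val",
                                 transform=eval_transform)

# Fine-grained datasets with torchvision wrappers, assumed already on disk.
fgvc = datasets.FGVCAircraft(root="data", split="test", transform=eval_transform)
food = datasets.Food101(root="data", split="test", transform=eval_transform)
flowers = datasets.Flowers102(root="data", split="test", transform=eval_transform)
cars = datasets.StanfordCars(root="data", split="test", transform=eval_transform)

# CUB-200 has no torchvision wrapper; a plain ImageFolder over the extracted
# image directory is one way to load it.
cub = datasets.ImageFolder(root="/path/to/CUB_200_2011/images",
                           transform=eval_transform)
```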
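The Experiment Setup row quotes a projected gradient-sign procedure. Below is a minimal PyTorch sketch of that kind of iteration, assuming a pretrained torchvision VGG-16, the target-class logit as the score ϕ_t, and the conventional PGD projection onto an ε-box intersected with [0, 1]; function and variable names are illustrative, not the authors' implementation.

```python
# Hedged sketch of the quoted gradient-sign perturbation (PGD-style), plus the
# bilinear upsampling of 14x14 heatmaps to the 224x224 input space.
import torch
import torch.nn.functional as F
import torchvision.models as models

def gradient_sign_perturbation(model, x, t, n_iter=10, eps=1e-3):
    """x_{n+1} = x_n + alpha * sign(grad phi_t(x_n)), projected back to the eps-box."""
    alpha = eps / n_iter                              # alpha = eps / n, as quoted
    x_orig = x.detach()
    lower = (x_orig - eps).clamp_min(0.0)             # conventional PGD box (assumption)
    upper = (x_orig + eps).clamp_max(1.0)
    x_adv = x_orig.clone()
    for _ in range(n_iter):
        x_adv = x_adv.detach().requires_grad_(True)
        score = model(x_adv)[:, t].sum()              # phi_t(x): target-class score
        (grad,) = torch.autograd.grad(score, x_adv)
        x_adv = x_adv + alpha * grad.sign()           # gradient-sign step
        x_adv = torch.max(torch.min(x_adv, upper), lower)  # project onto the box
    return x_adv.detach()

# Heatmaps taken before the last max-pooling layer (14x14 for VGG-16 at 224x224
# input) are upsampled to input resolution by bilinear interpolation.
def upsample_heatmap(heatmap_14x14):
    return F.interpolate(heatmap_14x14, size=(224, 224),
                         mode="bilinear", align_corners=False)

if __name__ == "__main__":
    vgg16 = models.vgg16(weights="IMAGENET1K_V1").eval()   # torchvision >= 0.13
    x = torch.rand(1, 3, 224, 224)                          # placeholder image in [0, 1]
    x_pert = gradient_sign_perturbation(vgg16, x, t=0, n_iter=10, eps=1e-3)
```

Note that the clamp quoted in the table reads min(x − ε, 0) and max(x + ε, 1); the sketch uses the conventional PGD projection with max(x − ε, 0) and min(x + ε, 1), so this detail may differ from the paper's exact formulation.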