Interpreting and Evaluating Neural Network Robustness
Authors: Fuxun Yu, Zhuwei Qin, Chenchen Liu, Liang Zhao, Yanzhi Wang, Xiang Chen
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive experiments, our metric demonstrates several advantages over conventional adversarial testing accuracy based robustness estimation: (1) it provides a uniformed evaluation to models with different structures and parameter scales; (2) it overperforms conventional accuracy based robustness estimation and provides a more reliable evaluation that is invariant to different test settings; (3) it can be fast generated without considerable testing cost. |
| Researcher Affiliation | Academia | Fuxun Yu¹, Zhuwei Qin², Chenchen Liu³, Liang Zhao⁴, Yanzhi Wang⁵ and Xiang Chen⁶ — ¹ ² ⁴ ⁶George Mason University; ³University of Maryland, Baltimore County; ⁵Northeastern University. ccliu@umbc.edu, yanz.wang@northeastern.edu, {fyu2, zqin, lzhao9, xchen26}@gmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper mentions that "The gradient regularization and Min Max training is reimplemented with Pytorch" (Footnote 3) and provides a link to a "Min Max model" from 'facebookresearch' (Footnote 4), but it does not state that *their own* methodology's source code is available or open-sourced. |
| Open Datasets | Yes | To test the generality of our metric for neural networks robustness evaluation, we adopt three common datasets (i.e. MNIST, CIFAR10, and ImageNet) |
| Dataset Splits | No | The paper mentions evaluating robustness using 'adversarial testing accuracies' and discusses 'batch size' for measurement stability, but it does not specify training, validation, and test dataset splits with percentages, counts, or references to predefined splits with citations for reproducibility. |
| Hardware Specification | Yes | To conduct the Min Max training, the reported time needed is about 52 hours on 128 V100 GPUs. |
| Software Dependencies | No | The paper mentions the use of "Pytorch" in Footnote 3, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The adversarial perturbations are constrained by the ℓ∞-norm as 0.3/1.0, 8.0/255.0, 16.0/255.0 on MNIST, CIFAR10, and ImageNet respectively. Correspondingly, the robustness verification is based on referencing the adversarial testing accuracies from two currently strongest attacks: 30-step PGD (PGD-30) attack based on cross-entropy loss and 30-step CW (CW-30) attacks based on C&W loss. |
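The ℓ∞-bounded PGD evaluation quoted in the setup row can be sketched as below. This is a minimal, self-contained NumPy illustration, not the paper's implementation: a toy logistic-regression "model" with a hand-derived gradient stands in for the evaluated networks, and only the CIFAR10 bound ε = 8/255 and the 30-step count are taken from the quote; the step size α and all model parameters are illustrative assumptions.

```python
import numpy as np

def pgd_attack_linf(x, y, w, b, eps, alpha, steps):
    """Untargeted ell_inf-bounded PGD on a toy logistic-regression model.

    Ascends the cross-entropy loss of p(y|x) = sigmoid(y * (w.x + b)) by
    taking signed-gradient steps on the input, projecting back into the
    eps-ball around x and into the valid range [0, 1] after every step.
    """
    x_adv = x.copy()
    for _ in range(steps):
        margin = y * (w @ x_adv + b)              # y in {-1, +1}
        # d/dx of log(1 + exp(-margin)) = -y * w * sigmoid(-margin)
        grad = -y * w / (1.0 + np.exp(margin))
        x_adv = x_adv + alpha * np.sign(grad)     # loss-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid image range
    return x_adv

def logistic_loss(x, y, w, b):
    return np.log1p(np.exp(-y * (w @ x + b)))

# Illustrative run (random toy inputs, CIFAR10-style bound from the quote).
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=10)   # fake 10-pixel "image" in [0, 1]
w = rng.normal(size=10)
b = 0.0
y = 1.0
eps, alpha, steps = 8.0 / 255.0, 2.0 / 255.0, 30  # alpha is an assumption
x_adv = pgd_attack_linf(x, y, w, b, eps, alpha, steps)
```

In the paper's actual protocol this loop runs for 30 steps against a full network (PGD-30), and the CW-30 variant replaces the cross-entropy loss with the C&W margin loss; the adversarial testing accuracy is then the fraction of `x_adv` the model still classifies correctly.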