Interpreting and Evaluating Neural Network Robustness

Authors: Fuxun Yu, Zhuwei Qin, Chenchen Liu, Liang Zhao, Yanzhi Wang, Xiang Chen

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "With extensive experiments, our metric demonstrates several advantages over conventional adversarial testing accuracy based robustness estimation: (1) it provides a uniformed evaluation to models with different structures and parameter scales; (2) it overperforms conventional accuracy based robustness estimation and provides a more reliable evaluation that is invariant to different test settings; (3) it can be fast generated without considerable testing cost." |
| Researcher Affiliation | Academia | Fuxun Yu¹, Zhuwei Qin², Chenchen Liu³, Liang Zhao⁴, Yanzhi Wang⁵ and Xiang Chen⁶ (¹,²,⁴,⁶George Mason University; ³University of Maryland, Baltimore County; ⁵Northeastern University). ccliu@umbc.edu, yanz.wang@northeastern.edu, {fyu2, zqin, lzhao9, xchen26}@gmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper mentions that "The gradient regularization and Min Max training is reimplemented with Pytorch" (Footnote 3) and provides a link to a "Min Max model" from facebookresearch (Footnote 4), but it does not state that the authors' own source code is available or open-sourced. |
| Open Datasets | Yes | "To test the generality of our metric for neural networks robustness evaluation, we adopt three common datasets (i.e. MNIST, CIFAR10, and ImageNet)" |
| Dataset Splits | No | The paper mentions evaluating robustness via adversarial testing accuracies and discusses batch size for measurement stability, but it does not specify training/validation/test splits by percentage or count, nor reference predefined splits with citations. |
| Hardware Specification | Yes | To conduct the Min Max training, the reported time is about 52 hours on 128 V100 GPUs. |
| Software Dependencies | No | The paper mentions the use of "Pytorch" in Footnote 3, but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | "The adversarial perturbations are constrained by the ℓ∞-norm as 0.3/1.0, 8.0/255.0, and 16.0/255.0 on MNIST, CIFAR10, and ImageNet respectively. Correspondingly, the robustness verification is based on referencing the adversarial testing accuracies from two currently strongest attacks: the 30-step PGD (PGD-30) attack based on cross-entropy loss and the 30-step CW (CW-30) attack based on C&W loss." |
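
For reference, below is a minimal sketch of the evaluation protocol quoted in the Experiment Setup row: adversarial testing accuracy under a 30-step ℓ∞ PGD attack with cross-entropy loss, written in PyTorch (the framework the paper reports using). Only the ε budgets and the 30-step count come from the paper; the step size `alpha`, the random start, and the `model`/`test_loader` objects are illustrative assumptions, not the authors' released code.

```python
# Sketch of adversarial-testing-accuracy measurement with PGD-30 (not the
# authors' code). Assumes inputs are scaled to [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps=30):
    """PGD-30: iterated signed-gradient ascent, projected back into the eps-ball."""
    # Random start inside the eps-ball (common PGD practice; an assumption here).
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)  # the paper's PGD-30 uses cross-entropy loss
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()           # gradient-sign ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project onto the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                          # stay in valid pixel range
    return x_adv.detach()

def adversarial_test_accuracy(model, loader, eps, alpha, steps=30):
    """Fraction of test samples still classified correctly under attack."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x_adv = pgd_attack(model, x, y, eps, alpha, steps)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return correct / total

# CIFAR10 setting from the paper: eps = 8/255; alpha = 2/255 is an assumed step size.
# acc = adversarial_test_accuracy(model, test_loader, eps=8.0 / 255.0, alpha=2.0 / 255.0)
```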