Interpreting and Evaluating Neural Network Robustness
Authors: Fuxun Yu, Zhuwei Qin, Chenchen Liu, Liang Zhao, Yanzhi Wang, Xiang Chen
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive experiments, our metric demonstrates several advantages over conventional adversarial testing accuracy based robustness estimation: (1) it provides a uniformed evaluation to models with different structures and parameter scales; (2) it overperforms conventional accuracy based robustness estimation and provides a more reliable evaluation that is invariant to different test settings; (3) it can be fast generated without considerable testing cost. |
| Researcher Affiliation | Academia | Fuxun Yu¹, Zhuwei Qin², Chenchen Liu³, Liang Zhao⁴, Yanzhi Wang⁵ and Xiang Chen⁶ — ¹ ² ⁴ ⁶George Mason University; ³University of Maryland, Baltimore County; ⁵Northeastern University. ccliu@umbc.edu, yanz.wang@northeastern.edu, {fyu2, zqin, lzhao9, xchen26}@gmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper mentions that "The gradient regularization and Min Max training is reimplemented with Pytorch" (Footnote 3) and provides a link to a "Min Max model" from 'facebookresearch' (Footnote 4), but it does not state that *their own* methodology's source code is available or open-sourced. |
| Open Datasets | Yes | To test the generality of our metric for neural networks robustness evaluation, we adopt three common datasets (i.e. MNIST, CIFAR10, and ImageNet) |
| Dataset Splits | No | The paper mentions evaluating robustness using 'adversarial testing accuracies' and discusses 'batch size' for measurement stability, but it does not specify training, validation, and test dataset splits with percentages, counts, or references to predefined splits with citations for reproducibility. |
| Hardware Specification | Yes | To conduct the Min Max training, the reported time needed is about 52 hours on 128 V100 GPUs. |
| Software Dependencies | No | The paper mentions the use of "Pytorch" in Footnote 3, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The adversarial perturbations are constrained by the ℓ∞-norm as 0.3/1.0, 8.0/255.0, 16.0/255.0 on MNIST, CIFAR10, and ImageNet respectively. Correspondingly, the robustness verification is based on referencing the adversarial testing accuracies from two currently strongest attacks: 30-step PGD (PGD-30) attack based on cross-entropy loss and 30-step CW (CW-30) attacks based on C&W loss. |
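The ℓ∞-bounded PGD evaluation quoted in the setup row can be sketched as below. This is a minimal, self-contained NumPy illustration, not the paper's implementation: a toy logistic-regression "model" with a hand-derived gradient stands in for the evaluated networks, and only the CIFAR10 bound ε = 8/255 and the 30-step count are taken from the quote; the step size α and all model parameters are illustrative assumptions.

```python
import numpy as np

def pgd_attack_linf(x, y, w, b, eps, alpha, steps):
    """Untargeted ell_inf-bounded PGD on a toy logistic-regression model.

    Ascends the cross-entropy loss of p(y|x) = sigmoid(y * (w.x + b)) by
    taking signed-gradient steps on the input, projecting back into the
    eps-ball around x and into the valid range [0, 1] after every step.
    """
    x_adv = x.copy()
    for _ in range(steps):
        margin = y * (w @ x_adv + b)              # y in {-1, +1}
        # d/dx of log(1 + exp(-margin)) = -y * w * sigmoid(-margin)
        grad = -y * w / (1.0 + np.exp(margin))
        x_adv = x_adv + alpha * np.sign(grad)     # loss-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid image range
    return x_adv

def logistic_loss(x, y, w, b):
    return np.log1p(np.exp(-y * (w @ x + b)))

# Illustrative run (random toy inputs, CIFAR10-style bound from the quote).
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=10)   # fake 10-pixel "image" in [0, 1]
w = rng.normal(size=10)
b = 0.0
y = 1.0
eps, alpha, steps = 8.0 / 255.0, 2.0 / 255.0, 30  # alpha is an assumption
x_adv = pgd_attack_linf(x, y, w, b, eps, alpha, steps)
```

In the paper's actual protocol this loop runs for 30 steps against a full network (PGD-30), and the CW-30 variant replaces the cross-entropy loss with the C&W margin loss; the adversarial testing accuracy is then the fraction of `x_adv` the model still classifies correctly.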