Adversarial Weight Perturbation Helps Robust Generalization

Authors: Dongxian Wu, Shu-Tao Xia, Yisen Wang

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments demonstrate that AWP indeed brings flatter weight loss landscape and can be easily incorporated into various existing adversarial training methods to further boost their adversarial robustness." "Through extensive experiments, we demonstrate that AWP consistently improves the adversarial robustness of state-of-the-art methods by a notable margin." |
| Researcher Affiliation | Academia | Dongxian Wu (1,3), Shu-Tao Xia (1,3), Yisen Wang (2). 1: Tsinghua University; 2: Key Lab. of Machine Perception (MoE), School of EECS, Peking University; 3: PCL Research Center of Networks and Communications, Peng Cheng Laboratory |
| Pseudocode | Yes | "The complete pseudo-code of AT-AWP and extensions of AWP to other adversarial training approaches like TRADES, MART and RST are shown in Appendix D." (Appendix D: Pseudo-code) |
| Open Source Code | Yes | https://github.com/csdongxian/AWP/tree/main/auto_attacks |
| Open Datasets | Yes | "We train a PreActResNet-18 [15] on CIFAR-10 [21] for 200 epochs" |
| Dataset Splits | No | The paper mentions training and test sets but does not provide specific percentages, sample counts, or an explicit description of a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments. |
| Experiment Setup | Yes | "We train a PreActResNet-18 [15] on CIFAR-10 [21] for 200 epochs using vanilla AT with a piece-wise learning rate schedule (initial learning rate is 0.1, divided by 10 at the 100th and 150th epoch). The training and test attacks are both 10-step PGD (PGD-10) with step size 2/255 and maximum L∞ perturbation ϵ = 8/255. For CIFAR-10 under L∞ attack with ϵ = 8/255, we train WideResNet-34-10 for AT, TRADES, and MART, and WideResNet-28-10 for Pre-training and RST, following their original papers. For pre-training, we fine-tune for 50 epochs using a learning rate of 0.001 as in [17]. Other defenses are trained for 200 epochs using SGD with momentum 0.9, weight decay 5×10⁻⁴, and an initial learning rate of 0.1 that is divided by 10 at the 100th and 150th epoch. Simple data augmentations such as 32×32 random crop with 4-pixel padding and random horizontal flip are applied. The training attack is PGD-10 with step size 2/255. For AWP, we set γ = 5×10⁻³." (See the configuration sketch below the table.) |
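For readers who want to see how the reported hyperparameters fit together, the sketch below wires the values from the Experiment Setup row into a single AT-AWP training loop in the spirit of the Appendix D pseudo-code. It is a minimal illustration under assumptions, not the authors' released code: the `pgd_attack` and `awp_directions` helpers, the torchvision ResNet-18 stand-in for PreActResNet-18, the batch size of 128, and the one-step layer-wise weight perturbation are all choices made here for brevity.

```python
# Illustrative AT-AWP training loop (a sketch under assumptions, not the authors' code).
import torch
import torch.nn.functional as F
import torchvision
import torchvision.transforms as T
from torchvision.models import resnet18  # stand-in for the paper's PreActResNet-18

EPS, STEP, PGD_ITERS = 8 / 255, 2 / 255, 10   # L-infinity budget and PGD-10 settings
GAMMA = 5e-3                                  # AWP weight-perturbation size

def pgd_attack(model, x, y):
    """PGD-10 with a random start inside the epsilon ball (image-range clamping omitted for brevity)."""
    delta = torch.empty_like(x).uniform_(-EPS, EPS).requires_grad_(True)
    for _ in range(PGD_ITERS):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + STEP * grad.sign()).clamp(-EPS, EPS).detach().requires_grad_(True)
    return (x + delta).detach()

def awp_directions(model, x_adv, y):
    """One-step, layer-wise approximation of the worst-case weight perturbation,
    scaled to GAMMA * ||w|| per parameter (simplified relative to Appendix D)."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = F.cross_entropy(model(x_adv), y)
    grads = torch.autograd.grad(loss, params)
    return [(p, GAMMA * p.norm() * g / (g.norm() + 1e-12)) for p, g in zip(params, grads)]

# CIFAR-10 with the augmentations from the setup: 32x32 random crop with 4-pixel padding, horizontal flip.
transform = T.Compose([T.RandomCrop(32, padding=4), T.RandomHorizontalFlip(), T.ToTensor()])
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)  # batch size assumed

model = resnet18(num_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[100, 150], gamma=0.1)  # /10 at epochs 100, 150

for epoch in range(200):
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)                  # inner maximization over inputs
        perturbs = awp_directions(model, x_adv, y)       # inner maximization over weights
        with torch.no_grad():
            for p, v in perturbs:                        # move to the perturbed weights w + v
                p.add_(v)
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()      # gradient evaluated at w + v
        with torch.no_grad():
            for p, v in perturbs:                        # restore w before the optimizer step
                p.sub_(v)
        opt.step()
    sched.step()
```

The perturb-then-restore pattern around the backward pass is the point of the sketch: the gradient is evaluated at the adversarially perturbed weights, while the update is applied to the restored ones, which coincides with the paper's AWP update for plain SGD (up to the weight-decay term); the exact formulation is given in the paper's Appendix D.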