Exploring Memorization in Adversarial Training

Authors: Yinpeng Dong, Ke Xu, Xiao Yang, Tianyu Pang, Zhijie Deng, Hang Su, Jun Zhu

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various datasets validate the effectiveness of the proposed method.
Researcher Affiliation | Collaboration | Yinpeng Dong (1,2), Ke Xu (4), Xiao Yang (1), Tianyu Pang (1), Zhijie Deng (1), Hang Su (1,3), Jun Zhu (1,2,3). Affiliations: (1) Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, China; (2) RealAI; (3) Peng Cheng Laboratory; (4) CMU
Pseudocode | No | The paper describes algorithms such as PGD and TRADES using mathematical formulations and textual descriptions (e.g., Section 2.1), but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Code is available at https://github.com/dongyp13/memorization-AT.
Open Datasets | Yes | The experiments are conducted on CIFAR-10 (Krizhevsky & Hinton, 2009) with a WideResNet model... We also provide the experimental results on CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and SVHN (Netzer et al., 2011) datasets...
Dataset Splits | No | The paper mentions 'training' and 'test accuracies' and the 'generalization gap' (i.e., the difference between training and test accuracies), but does not explicitly provide the percentages or sample counts for training, validation, and test dataset splits.
Hardware Specification | Yes | All of the experiments are conducted on NVIDIA 2080 Ti GPUs.
Software Dependencies | No | The paper mentions the use of an 'SGD optimizer' but does not specify software versions for programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries.
Experiment Setup | Yes | In training, we use the 10-step PGD adversary with α = 2/255. The models are trained via the SGD optimizer with momentum 0.9, weight decay 0.0005, and batch size 128. For CIFAR-10/100, we set the initial learning rate to 0.1 and decay it by a factor of 0.1 at epochs 100 and 150, for a total of 200 training epochs. For SVHN, the learning rate starts from 0.01 with a cosine annealing schedule for a total of 80 training epochs. In our method, we set η = 0.9 and w = 30 along a Gaussian ramp-up curve (Laine & Aila, 2017).
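For readers who want to approximate the reported setup, the following is a minimal sketch, assuming PyTorch: it wires a 10-step PGD adversary (α = 2/255) into a standard adversarial-training loop with the stated SGD hyperparameters and the CIFAR-10/100 learning-rate schedule. The perturbation bound eps = 8/255, the `model` / `train_loader` placeholders, and the plain cross-entropy outer loss are assumptions not taken from the table, and the paper's own weighting scheme (η = 0.9, w = 30 with a Gaussian ramp-up) is not implemented here.

```python
# Minimal sketch of the reported CIFAR-10/100 adversarial-training setup.
# Assumptions (not stated in the table): PyTorch, cross-entropy loss,
# eps = 8/255 for the PGD ball, and placeholder `model` / `train_loader`.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """10-step PGD adversary with step size alpha = 2/255 (eps and the
    omitted random start are assumptions, not values from the table)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient-sign step, then project back into the eps-ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

def train(model, train_loader, device="cuda"):
    # SGD: momentum 0.9, weight decay 0.0005; batch size 128 is set in the loader.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=5e-4)
    # CIFAR-10/100 schedule: lr 0.1, decayed by 0.1 at epochs 100 and 150, 200 epochs total.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                     milestones=[100, 150], gamma=0.1)
    # (For SVHN the table instead reports lr 0.01 with cosine annealing over 80 epochs,
    #  e.g. torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=80).)
    for epoch in range(200):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(model, x, y)          # inner maximization
            loss = F.cross_entropy(model(x_adv), y)  # plain PGD-AT outer loss (assumed)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
```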