On the Algorithmic Stability of Adversarial Training

Authors: Yue Xing, Qifan Song, Guang Cheng

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Beyond the theoretical analysis under simple models, we provide a theory for a two-layer ReLU network with lazy training (training the hidden layer) and observe the effectiveness of the noise injection method. We also obtain empirical evidence that, for deep neural network models, proper forms of noise injection and more accurate attack calculation (e.g., PGD-k over FGM) improve the generalization error. We use simulation to illustrate how noise-injected adversarial training affects performance.
Researcher Affiliation | Academia | Yue Xing, Department of Statistics, Purdue University, xing49@purdue.edu; Qifan Song, Department of Statistics, Purdue University, qfsong@purdue.edu; Guang Cheng, Department of Statistics, Purdue University, chengg@purdue.edu
Pseudocode | Yes | Algorithm 1: Add noise to weight and data (a hedged sketch of this noise-injection step is given after this table).
Open Source Code | No | No; the main text mentions that the experiments use implementations from other papers shared on GitHub.
Open Datasets | Yes | Besides the results in two-layer networks, we also numerically study the generalization gap using deep neural networks with the CIFAR-10 dataset.
Dataset Splits | No | The paper does not explicitly provide training, validation, and test splits with percentages or sample counts for the experiments, although it mentions using CIFAR-10 and generating 1000 samples for linear regression.
Hardware Specification | No | The paper's reproducibility checklist answers 'No' to the question "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?"
Software Dependencies | No | The paper does not specify software dependencies with version numbers.
Experiment Setup | Yes | To train the regression model, we train T = 500 epochs with learning rate 0.01 and the initialization set to 0. ... The batch size is set to 128 and the learning rate is 0.001. We train for 200 epochs using the Adam optimizer. (A hedged configuration sketch for the CIFAR-10 experiments follows this table.)
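
The Pseudocode row refers to the paper's Algorithm 1, which injects noise into both the model weights and the data before the adversarial perturbation is computed. The following is a minimal PyTorch sketch of that idea, not the paper's exact procedure: the Gaussian noise scales, the PGD attack parameters, and the choice to restore the clean weights before the update are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def noisy_adversarial_step(model, x, y, optimizer,
                           weight_noise_std=0.01, data_noise_std=0.01,
                           eps=8 / 255, alpha=2 / 255, pgd_steps=10):
    """One training step with noise injected into weights and data before the
    attack is computed (hypothetical sketch of Algorithm 1; all scales are
    illustrative, not the paper's values)."""
    # 1. Perturb the weights in place with Gaussian noise, keeping a clean copy.
    clean_params = [p.detach().clone() for p in model.parameters()]
    with torch.no_grad():
        for p in model.parameters():
            p.add_(weight_noise_std * torch.randn_like(p))

    # 2. Perturb the data with Gaussian noise, then run a PGD attack
    #    against the noise-injected model and inputs.
    x_noisy = x + data_noise_std * torch.randn_like(x)
    delta = torch.zeros_like(x_noisy, requires_grad=True)
    for _ in range(pgd_steps):
        loss = F.cross_entropy(model(x_noisy + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)

    # 3. Restore the clean weights and update them on the adversarial batch
    #    (whether the update uses noisy or clean weights is an assumption here).
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), clean_params):
            p.copy_(p0)
    optimizer.zero_grad()
    F.cross_entropy(model(x_noisy + delta.detach()), y).backward()
    optimizer.step()
```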
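
For the deep-network experiments, the Experiment Setup row quotes only a few hyperparameters (batch size 128, learning rate 0.001, Adam, 200 epochs on CIFAR-10). Below is a hedged configuration sketch under those settings; the ResNet-18 architecture and the plain ToTensor transform are assumptions, since the report does not specify them, and the loop reuses the noisy_adversarial_step sketch above.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

# Hyperparameters quoted in the report; architecture and transforms are assumed.
BATCH_SIZE, LR, EPOCHS = 128, 1e-3, 200

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18(num_classes=10).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

for epoch in range(EPOCHS):
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        # Noise-injected adversarial training step from the sketch above;
        # plain training would instead backpropagate on the clean batch.
        noisy_adversarial_step(model, x, y, optimizer)
```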