Fast is better than free: Revisiting adversarial training

Authors: Eric Wong, Leslie Rice, J. Zico Kolter

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To demonstrate the effectiveness of FGSM adversarial training with fast training methods, we run a number of experiments on MNIST, CIFAR10, and ImageNet benchmarks. All CIFAR10 experiments in this paper are run on a single GeForce RTX 2080ti using the PreActResNet18 architecture, and all ImageNet experiments are run on a single machine with four GeForce RTX 2080tis using the ResNet50 architecture (He et al., 2016). Repositories for reproducing all experiments and the corresponding trained model weights are available at https://github.com/locuslab/fast_adversarial. All experiments using FGSM adversarial training in this section are carried out with random initial starting points and step size α = 1.25ϵ as described in Section 4.1."
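The recipe quoted in this row, a uniform random start followed by a single signed-gradient step of size α = 1.25ϵ, is compact enough to sketch. A minimal PyTorch sketch under the paper's ℓ∞ threat model; the function name and the omission of image-range clipping are my assumptions, not the authors' code:

```python
import torch
import torch.nn.functional as F

def fgsm_random_start(model, X, y, epsilon, alpha=None):
    """One-step FGSM perturbation with a uniform random start.
    alpha defaults to 1.25 * epsilon, the step size used in the paper."""
    alpha = 1.25 * epsilon if alpha is None else alpha
    # Random initial point inside the L-inf ball of radius epsilon
    delta = torch.empty_like(X).uniform_(-epsilon, epsilon).requires_grad_(True)
    loss = F.cross_entropy(model(X + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    # One signed-gradient step, then project back onto the ball
    return (delta + alpha * grad.sign()).clamp(-epsilon, epsilon).detach()
```

The random start is the ingredient the paper credits for letting this single step stand in for multi-step PGD during training.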
Researcher Affiliation | Collaboration | Eric Wong, Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA (ericwong@cs.cmu.edu); Leslie Rice, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA (larice@cs.cmu.edu); J. Zico Kolter, Computer Science Department, Carnegie Mellon University and Bosch Center for Artificial Intelligence, Pittsburgh, PA 15213, USA (zkolter@cs.cmu.edu)
Pseudocode | Yes | "Algorithm 1 PGD adversarial training for T epochs, given some radius ϵ, adversarial step size α and N PGD steps and a dataset of size M for a network fθ" ... "Algorithm 2 Free adversarial training for T epochs, given some radius ϵ, N minibatch replays, and a dataset of size M for a network fθ" ... "Algorithm 3 FGSM adversarial training for T epochs, given some radius ϵ, N PGD steps, step size α, and a dataset of size M for a network fθ"
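For orientation, a minimal PyTorch sketch of the training loop Algorithm 1 describes; Algorithm 3 is the one-step special case with the random start shown above. `loader` and `optimizer` are placeholders, and clipping X + δ to the valid image range is omitted:

```python
import torch
import torch.nn.functional as F

def pgd_adv_train_epoch(model, loader, optimizer, epsilon, alpha, n_steps,
                        device="cuda"):
    """One epoch of PGD adversarial training (a sketch of Algorithm 1)."""
    model.train()
    for X, y in loader:
        X, y = X.to(device), y.to(device)
        delta = torch.zeros_like(X, requires_grad=True)
        # Inner maximization: n_steps of projected gradient ascent on delta
        for _ in range(n_steps):
            loss = F.cross_entropy(model(X + delta), y)
            grad = torch.autograd.grad(loss, delta)[0]
            delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
            delta = delta.detach().requires_grad_(True)
        # Outer minimization: update the model on the adversarial example
        optimizer.zero_grad()
        F.cross_entropy(model(X + delta.detach()), y).backward()
        optimizer.step()
```

The cost difference between the algorithms is visible here: Algorithm 1 pays n_steps + 1 forward/backward passes per minibatch, while Algorithm 3 pays only two.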
Open Source Code | Yes | "All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial."
Open Datasets | Yes | "To demonstrate the effectiveness of FGSM adversarial training with fast training methods, we run a number of experiments on MNIST, CIFAR10, and ImageNet benchmarks."
Dataset Splits | No | The paper mentions using a 'small minibatch of training data' for early stopping to detect overfitting, but it does not specify a formal validation split (e.g., percentages or exact counts) of the dataset itself for general model evaluation.
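As a hypothetical illustration of the early-stopping check mentioned here (the function below is a sketch, not the authors' code), one can track PGD robust accuracy on a single held-aside training minibatch each epoch and stop when it collapses, the paper's symptom of catastrophic overfitting:

```python
import torch
import torch.nn.functional as F

def robust_minibatch_accuracy(model, X, y, epsilon, alpha, n_steps=10):
    """PGD robust accuracy on one small training minibatch; a sharp drop
    between epochs is the signal to stop training early."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(X + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
        delta = delta.detach().requires_grad_(True)
    with torch.no_grad():
        return (model(X + delta).argmax(1) == y).float().mean().item()
```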
Hardware Specification | Yes | "All CIFAR10 experiments in this paper are run on a single GeForce RTX 2080ti using the PreActResNet18 architecture, and all ImageNet experiments are run on a single machine with four GeForce RTX 2080tis using the ResNet50 architecture (He et al., 2016)."
Software Dependencies | No | "Speedup with mixed-precision was incorporated with the Apex amp package at the O1 optimization level for ImageNet experiments and O2 without loss scaling for CIFAR10 experiments." (Section 5) The paper mentions a specific software package (Apex amp) but does not provide its version number or any other software dependencies with version information.
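For context, a sketch of how the quoted Apex amp optimization levels are typically wired in; the toy model and data are illustrative only, and reading "without loss scaling" as a fixed loss_scale of 1.0 is an assumption:

```python
import torch
import torch.nn as nn
from apex import amp  # NVIDIA Apex; no version is pinned in the paper

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                      nn.Linear(16, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

# O1 for ImageNet; O2 for CIFAR10 with loss scaling disabled
# (assumed here to mean loss_scale=1.0)
model, optimizer = amp.initialize(model, optimizer,
                                  opt_level="O2", loss_scale=1.0)

x = torch.randn(8, 3, 32, 32).cuda()
y = torch.randint(0, 10, (8,)).cuda()
loss = nn.functional.cross_entropy(model(x), y)
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()  # backward pass runs through the amp wrapper
optimizer.step()
```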
Experiment Setup | Yes | "All experiments using FGSM adversarial training in this section are carried out with random initial starting points and step size α = 1.25ϵ as described in Section 4.1." "All PGD adversaries used at evaluation are run with 10 random restarts for 50 iterations..." "For N epochs, we use a cyclic learning rate that increases linearly from 0 to λ over the first N/2 epochs, then decreases linearly from λ to 0 for the remaining epochs, where λ is the maximum learning rate. For all methods, we use a batch size of 128, and SGD optimizer with momentum 0.9 and weight decay 5 × 10⁻⁴. We report the average results over 3 random seeds. The remaining parameters for learning rate schedules and number of epochs for the DAWNBench experiments are in Table 7."
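The cyclic schedule described in this row is straightforward to reproduce. A minimal sketch using PyTorch's LambdaLR (the authors' repository may use a different scheduler class); `lam`, the epoch count, and the steps per epoch are illustrative values:

```python
import torch

def make_cyclic_lr(optimizer, total_steps):
    """Learning rate rises linearly from 0 to the optimizer's base lr over
    the first half of training, then falls linearly back to 0."""
    triangle = lambda step: 1.0 - abs(2.0 * step / total_steps - 1.0)
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=triangle)

model = torch.nn.Linear(10, 2)
lam = 0.2  # maximum learning rate lambda (illustrative)
opt = torch.optim.SGD(model.parameters(), lr=lam,
                      momentum=0.9, weight_decay=5e-4)
sched = make_cyclic_lr(opt, total_steps=15 * 391)  # 15 epochs x 391 batches
# Call sched.step() after every optimizer.step(); stepping per minibatch
# rather than per epoch is an implementation choice, not from the paper.
```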