Fast is better than free: Revisiting adversarial training
Authors: Eric Wong, Leslie Rice, J. Zico Kolter
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the effectiveness of FGSM adversarial training with fast training methods, we run a number of experiments on MNIST, CIFAR10, and ImageNet benchmarks. All CIFAR10 experiments in this paper are run on a single GeForce RTX 2080ti using the PreActResNet18 architecture, and all ImageNet experiments are run on a single machine with four GeForce RTX 2080tis using the ResNet50 architecture (He et al., 2016). Repositories for reproducing all experiments and the corresponding trained model weights are available at https://github.com/locuslab/fast_adversarial. All experiments using FGSM adversarial training in this section are carried out with random initial starting points and step size α = 1.25ϵ as described in Section 4.1. |
| Researcher Affiliation | Collaboration | Eric Wong, Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA, ericwong@cs.cmu.edu; Leslie Rice, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA, larice@cs.cmu.edu; J. Zico Kolter, Computer Science Department, Carnegie Mellon University and Bosch Center for Artificial Intelligence, Pittsburgh, PA 15213, USA, zkolter@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1 PGD adversarial training for T epochs, given some radius ϵ, adversarial step size α and N PGD steps and a dataset of size M for a network fθ ... Algorithm 2 Free adversarial training for T epochs, given some radius ϵ, N minibatch replays, and a dataset of size M for a network fθ ... Algorithm 3 FGSM adversarial training for T epochs, given some radius ϵ, N PGD steps, step size α, and a dataset of size M for a network fθ. (A runnable sketch of Algorithm 3 appears after this table.) |
| Open Source Code | Yes | All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial. |
| Open Datasets | Yes | To demonstrate the effectiveness of FGSM adversarial training with fast training methods, we run a number of experiments on MNIST, CIFAR10, and ImageNet benchmarks. |
| Dataset Splits | No | The paper mentions using a 'small minibatch of training data' for early stopping to detect overfitting, but it does not specify a formal validation split (e.g., percentages or exact counts) used for model evaluation. |
| Hardware Specification | Yes | All CIFAR10 experiments in this paper are run on a single GeForce RTX 2080ti using the PreActResNet18 architecture, and all ImageNet experiments are run on a single machine with four GeForce RTX 2080tis using the ResNet50 architecture (He et al., 2016). |
| Software Dependencies | No | Speedup with mixed-precision was incorporated with the Apex amp package at the O1 optimization level for ImageNet experiments and O2 without loss scaling for CIFAR10 experiments. (Section 5) The paper names a specific software package (Apex amp) but does not provide its version number or any other software dependencies with version information. (A short example of the amp setup appears after this table.) |
| Experiment Setup | Yes | All experiments using FGSM adversarial training in this section are carried out with random initial starting points and step size α = 1.25ϵ as described in Section 4.1. All PGD adversaries used at evaluation are run with 10 random restarts for 50 iterations... For N epochs, we use a cyclic learning rate that increases linearly from 0 to λ over the first N/2 epochs, then decreases linearly from λ to 0 for the remaining epochs, where λ is the maximum learning rate. For all methods, we use a batch size of 128, and SGD optimizer with momentum 0.9 and weight decay 5 × 10−4. We report the average results over 3 random seeds. The remaining parameters for learning rate schedules and number of epochs for the DAWNBench experiments are in Table 7. (A sketch of this cyclic schedule appears after this table.) |
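
The pseudocode rows above summarize Algorithms 1–3 of the paper. As a concrete illustration, here is a minimal PyTorch sketch of one FGSM adversarial training step (Algorithm 3), using the random initialization and step size α = 1.25ϵ noted in the rows above. The function and variable names are ours, and the clamp to a [0, 1] image range is an assumption about preprocessing, not a quote from the authors' repository.

```python
import torch
import torch.nn.functional as F

def fgsm_train_step(model, opt, X, y, epsilon, alpha):
    """One FGSM adversarial training step (sketch of Algorithm 3).

    Assumes inputs X are scaled to [0, 1]; epsilon and alpha are in the
    same scale, with alpha = 1.25 * epsilon as in Section 4.1.
    """
    # Random initialization: delta ~ Uniform(-epsilon, epsilon)
    delta = torch.empty_like(X).uniform_(-epsilon, epsilon)
    delta.requires_grad_(True)

    # Single FGSM step on the perturbation
    loss = F.cross_entropy(model(X + delta), y)
    loss.backward()  # populates delta.grad (param grads are cleared below)
    delta = (delta + alpha * delta.grad.sign()).clamp(-epsilon, epsilon)
    delta = (X + delta).clamp(0, 1) - X  # keep the adversarial image valid
    delta = delta.detach()

    # Update model parameters on the adversarial example
    opt.zero_grad()
    loss = F.cross_entropy(model(X + delta), y)
    loss.backward()
    opt.step()
    return loss.item()
```

Calling this once per minibatch reproduces the structure of Algorithm 3; for CIFAR10 at ϵ = 8/255, the step size α = 1.25ϵ is roughly 10/255.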
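
For the mixed-precision detail in the Software Dependencies row, the snippet below shows how the stated optimization levels would plausibly be wired up with Apex amp (now deprecated in favor of torch.cuda.amp). The stand-in model, the toy data, and the `loss_scale=1.0` reading of "without loss scaling" are our assumptions; only `amp.initialize` and `amp.scale_loss` come from the Apex package itself.

```python
import torch
import torch.nn as nn
from apex import amp  # NVIDIA Apex; superseded by torch.cuda.amp

model = nn.Linear(10, 2).cuda()  # stand-in for PreActResNet18 / ResNet50
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# O1 for ImageNet; O2 for CIFAR10 with loss scaling disabled, per Section 5.
# Passing loss_scale=1.0 is our reading of "without loss scaling".
model, opt = amp.initialize(model, opt, opt_level="O2", loss_scale=1.0)

x = torch.randn(4, 10).cuda()
y = torch.randint(0, 2, (4,)).cuda()
loss = nn.functional.cross_entropy(model(x), y)

opt.zero_grad()
with amp.scale_loss(loss, opt) as scaled_loss:  # Apex handles fp16 casting
    scaled_loss.backward()
opt.step()
```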
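
The cyclic learning rate in the Experiment Setup row has a simple triangular shape: 0 up to λ over the first half of training, then back to 0. The paper does not prescribe an implementation; one way to reproduce the shape is PyTorch's built-in `CyclicLR`, stepped per minibatch, as sketched below. The values for λ, the epoch count, and the steps per epoch are hypothetical (the actual per-experiment values are in Table 7 of the paper).

```python
import torch

lr_max, epochs, steps_per_epoch = 0.2, 15, 391  # hypothetical; see Table 7

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=lr_max,
                      momentum=0.9, weight_decay=5e-4)

# Triangular schedule: 0 -> lr_max over the first half of training,
# lr_max -> 0 over the second half, stepped once per minibatch.
total_steps = epochs * steps_per_epoch
scheduler = torch.optim.lr_scheduler.CyclicLR(
    opt, base_lr=0.0, max_lr=lr_max,
    step_size_up=total_steps // 2, step_size_down=total_steps // 2,
    cycle_momentum=False)

for step in range(total_steps):
    # ... forward pass, loss.backward() would go here ...
    opt.step()        # placeholder update so scheduler.step() is legal
    scheduler.step()  # advance the learning rate once per minibatch
```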