Understanding Catastrophic Overfitting in Single-step Adversarial Training

Authors: Hoki Kim, Woojin Lee, Jaewook Lee (pp. 8119-8127)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this study, we first analyze the differences before and after catastrophic overfitting. We then identify the relationship between distortion of the decision boundary and catastrophic overfitting. Through extensive experiments, we empirically discovered the relationship between single-step adversarial training and decision boundary distortion, and found that the problem of single-step adversarial training is a fixed magnitude of the perturbation, not the direction of the attack. ... First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015), using PreActResNet-18 (He et al. 2016). (See the attack sketch after the table.)
Researcher Affiliation | Academia | Seoul National University, Seoul, Korea
Pseudocode | Yes | Algorithm 1: Stable single-step adversarial training. (A hedged sketch follows the table.)
Open Source Code | Yes | Our implementation in PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020) is available at https://github.com/Harry24k/catastrophic-overfitting. (See the Torchattacks sketch after the table.)
Open Datasets | Yes | First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015). (See the data-loading sketch after the table.)
Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or citations for the training, validation, and test splits. While it mentions the use of CIFAR-10 and Tiny ImageNet, which have standard splits, it does not specify the exact partitioning used for reproduction.
Hardware Specification | Yes | All experiments were conducted on a single NVIDIA TITAN V over three different random seeds.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020)' but does not specify version numbers for these software components, which are required for reproducibility.
Experiment Setup | Yes | We use SGD with a learning rate of 0.01, momentum of 0.9 and weight decay of 5e-4. To check whether catastrophic overfitting occurs, we set the total epoch to 200. The learning rate decays with a factor of 0.2 at 60, 120, and 160 epochs. ... the maximum perturbation ϵ was set to 8/255. For PGD adversarial training, we use a step size of α = max(2/255, ϵ/n), where n is the number of steps. TRADES uses α = 2/255 and seven steps for generating adversarial images. Following Wong, Rice, and Kolter (2020), we use α = 1.25ϵ for fast adversarial training and the proposed method. The regularization parameter β for the gradient alignment of GradAlign is set to 0.2 as suggested by Andriushchenko and Flammarion (2020). (See the configuration sketch after the table.)
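
The Research Type row quotes the paper's central observation that the culprit behind catastrophic overfitting is the fixed magnitude of the single-step perturbation rather than its direction. Below is a minimal sketch, not the authors' code, of the two attacks that comparison rests on: single-step FGSM always takes the full-magnitude step ϵ along the sign of the gradient, while multi-step PGD takes smaller steps and re-projects into the ϵ-ball. `model` and all tensors are placeholders.

```python
# Hedged sketch, not the authors' implementation.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """Single-step attack: one full-magnitude step (eps) along the gradient sign.
    The fixed step size is the quantity the paper identifies as problematic."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return torch.clamp(x.detach() + eps * grad.sign(), 0, 1)

def pgd_attack(model, x, y, eps, alpha, steps):
    """Multi-step PGD within the same eps-ball, used to measure robust accuracy."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()
```

Catastrophic overfitting shows up as a sudden collapse of PGD robust accuracy while FGSM accuracy stays high; tracking both after every epoch reproduces the kind of recording the quoted passage describes.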
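The Pseudocode row only names Algorithm 1, "Stable single-step adversarial training", without reproducing it. The sketch below is an assumption-laden illustration of a single-step procedure that acts on the paper's conclusion that the fixed magnitude, not the direction, is the problem: probe a few checkpoints along the sign-gradient direction and keep the smallest per-example magnitude that already flips the prediction, falling back to the full ϵ otherwise. The checkpoint count `c` and every name here are hypothetical; for the authors' exact Algorithm 1, consult the paper or the repository.

```python
# Assumption-based sketch; not the paper's Algorithm 1 verbatim.
import torch
import torch.nn.functional as F

def stable_single_step(model, x, y, eps, c=3):
    """Generate a single-step adversarial example whose magnitude is shrunk
    per example to the smallest of c checkpoints that is already adversarial."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    direction = torch.autograd.grad(loss, x)[0].sign()
    x = x.detach()

    # scale / c is the fraction of eps applied; start from the full step.
    scale = torch.full((x.size(0), 1, 1, 1), float(c), device=x.device)
    with torch.no_grad():
        for k in range(1, c + 1):
            x_k = torch.clamp(x + (k / c) * eps * direction, 0, 1)
            wrong = (model(x_k).argmax(1) != y).view(-1, 1, 1, 1)
            # Remember the first (smallest) checkpoint that misclassifies.
            scale = torch.where(wrong & (scale == c),
                                torch.full_like(scale, float(k)), scale)
    return torch.clamp(x + (scale / c) * eps * direction, 0, 1)
```

The intent of such a scheme is that examples which cross the decision boundary early receive a shorter step, which is one way to avoid the boundary distortion the paper links to catastrophic overfitting.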
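The Open Source Code row states that the released implementation uses PyTorch with Torchattacks. Below is a hedged usage sketch of a robust-accuracy evaluation with that library; `model`, `loader`, the step size, and the step count are placeholders rather than values taken from the repository.

```python
import torch
import torchattacks  # pip install torchattacks

def pgd_robust_accuracy(model, loader, device="cuda"):
    """Robust accuracy under Torchattacks' PGD; eps mirrors the paper's 8/255,
    while alpha and the number of steps are illustrative choices."""
    attack = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv = attack(images, labels)  # returns adversarial images
        with torch.no_grad():
            correct += (model(adv).argmax(1) == labels).sum().item()
        total += labels.size(0)
    return correct / total
```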
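The Open Datasets row cites CIFAR-10 and Tiny ImageNet. A minimal data-loading sketch for the standard CIFAR-10 splits via torchvision follows; the augmentations and batch size are common defaults, not values confirmed by the paper, and Tiny ImageNet, which is distributed separately, is omitted.

```python
import torch
import torchvision
import torchvision.transforms as T

# CIFAR-10 as shipped by torchvision: 50,000 training and 10,000 test images.
train_tf = T.Compose([T.RandomCrop(32, padding=4),
                      T.RandomHorizontalFlip(),
                      T.ToTensor()])
test_tf = T.ToTensor()

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=test_tf)

# Batch size is an assumed value, not taken from the paper.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)
```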
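The Experiment Setup row lists the optimizer, learning-rate schedule, and attack hyperparameters explicitly, so they translate directly into a configuration sketch. torchvision's resnet18 stands in for the paper's PreActResNet-18, and the training-loop body is elided.

```python
import torch
import torchvision

# Stand-in architecture; the paper uses PreActResNet-18.
model = torchvision.models.resnet18(num_classes=10)

# SGD with lr 0.01, momentum 0.9, and weight decay 5e-4, as quoted above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
# Learning rate decays by a factor of 0.2 at epochs 60, 120, and 160.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2)

EPOCHS = 200              # long enough to check for catastrophic overfitting
EPS = 8 / 255             # maximum L-infinity perturbation
ALPHA_FAST = 1.25 * EPS   # step size for fast adversarial training
BETA_GRADALIGN = 0.2      # GradAlign gradient-alignment regularization weight

for epoch in range(EPOCHS):
    # ... one epoch of (single-step) adversarial training would run here ...
    scheduler.step()
```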