Understanding Catastrophic Overfitting in Single-step Adversarial Training
Authors: Hoki Kim, Woojin Lee, Jaewook Lee
AAAI 2021, pp. 8119-8127 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we first analyze the differences before and after catastrophic overfitting. We then identify the relationship between distortion of the decision boundary and catastrophic overfitting. Through extensive experiments, we empirically discovered the relationship between single-step adversarial training and decision boundary distortion, and found that the problem of single-step adversarial training is a fixed magnitude of the perturbation, not the direction of the attack. ... First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015), using PreActResNet-18 (He et al. 2016). (An illustrative single-step attack sketch follows the table.) |
| Researcher Affiliation | Academia | Seoul National University, Seoul, Korea |
| Pseudocode | Yes | Algorithm 1: Stable single-step adversarial training |
| Open Source Code | Yes | Our implementation in PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020) is available at https://github.com/Harry24k/catastrophic-overfitting. |
| Open Datasets | Yes | First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015). |
| Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or citations for the training, validation, and test splits. While it mentions the use of CIFAR-10 and Tiny ImageNet, which have standard splits, it does not specify the exact partitioning used for reproduction. |
| Hardware Specification | Yes | All experiments were conducted on a single NVIDIA TITAN V over three different random seeds. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020)' but does not specify the version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | We use SGD with a learning rate of 0.01, momentum of 0.9, and weight decay of 5e-4. To check whether catastrophic overfitting occurs, we set the total epoch to 200. The learning rate decays with a factor of 0.2 at 60, 120, and 160 epochs. ... the maximum perturbation ϵ was set to 8/255. For PGD adversarial training, we use a step size of α = max(2/255, ϵ/n), where n is the number of steps. TRADES uses α = 2/255 and seven steps for generating adversarial images. Following Wong, Rice, and Kolter (2020), we use α = 1.25ϵ for fast adversarial training and the proposed method. The regularization parameter β for the gradient alignment of GradAlign is set to 0.2 as suggested by Andriushchenko and Flammarion (2020). (A configuration sketch follows the table.) |
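
The quoted finding centers on the fixed perturbation magnitude used by single-step attacks. The following is a minimal sketch of FGSM-style adversarial example generation with random initialization, as in fast adversarial training (Wong, Rice, and Kolter 2020); it is illustrative only, and the function name `fgsm_example` and its signature are assumptions rather than code from the authors' repository.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8/255, alpha=1.25 * (8/255)):
    """Single-step adversarial example with a fixed-magnitude step (illustrative sketch)."""
    # Random start inside the eps-ball, as in fast adversarial training.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    # One step of fixed size alpha along the gradient sign, then projection
    # back into the eps-ball and the valid image range.
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
    return (x + delta).clamp(0, 1)
```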
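The training configuration reported in the Experiment Setup row maps onto standard PyTorch components. Below is a hedged sketch assembled under the reported hyperparameters; `PreActResNet18`, `train_loader`, and the reuse of `fgsm_example` from the sketch above are placeholders, not the authors' implementation.

```python
import torch
from torch import nn, optim

model = PreActResNet18().cuda()   # placeholder architecture (He et al. 2016)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
# Decay the learning rate by a factor of 0.2 at epochs 60, 120, and 160 (200 epochs total).
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)

eps = 8 / 255        # maximum L-infinity perturbation
alpha = 1.25 * eps   # step size for fast adversarial training

for epoch in range(200):
    for x, y in train_loader:     # placeholder CIFAR-10 loader
        x, y = x.cuda(), y.cuda()
        x_adv = fgsm_example(model, x, y, eps=eps, alpha=alpha)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```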