Understanding Catastrophic Overfitting in Single-step Adversarial Training

Authors: Hoki Kim, Woojin Lee, Jaewook Lee (pp. 8119-8127)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this study, we first analyze the differences before and after catastrophic overfitting. We then identify the relationship between distortion of the decision boundary and catastrophic overfitting. Through extensive experiments, we empirically discovered the relationship between single-step adversarial training and decision boundary distortion, and found that the problem of single-step adversarial training is a fixed magnitude of the perturbation, not the direction of the attack. ... First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015), using PreActResNet-18 (He et al. 2016). (See the attack sketch after the table.)
Researcher Affiliation | Academia | Seoul National University, Seoul, Korea
Pseudocode | Yes | Algorithm 1: Stable single-step adversarial training. (A hedged sketch follows the table.)
Open Source Code | Yes | Our implementation in PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020) is available at https://github.com/Harry24k/catastrophic-overfitting. (See the Torchattacks sketch after the table.)
Open Datasets | Yes | First, to analyze catastrophic overfitting, we start by recording robust accuracy of fast adversarial training on CIFAR-10 (Krizhevsky, Hinton et al. 2009). ... In this section, we conduct a set of experiments on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Tiny ImageNet (Le and Yang 2015). (See the data-loading sketch after the table.)
Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or citations for the training, validation, and test splits. While it mentions the use of CIFAR-10 and Tiny ImageNet, which have standard splits, it does not specify the exact partitioning used for reproduction.
Hardware Specification | Yes | All experiments were conducted on a single NVIDIA TITAN V over three different random seeds.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019) with Torchattacks (Kim 2020)' but does not specify version numbers for these software components, which are required for reproducibility.
Experiment Setup | Yes | We use SGD with a learning rate of 0.01, momentum of 0.9 and weight decay of 5e-4. To check whether catastrophic overfitting occurs, we set the total epoch to 200. The learning rate decays with a factor of 0.2 at 60, 120, and 160 epochs. ... the maximum perturbation ϵ was set to 8/255. For PGD adversarial training, we use a step size of α = max(2/255, ϵ/n), where n is the number of steps. TRADES uses α = 2/255 and seven steps for generating adversarial images. Following Wong, Rice, and Kolter (2020), we use α = 1.25ϵ for fast adversarial training and the proposed method. The regularization parameter β for the gradient alignment of GradAlign is set to 0.2 as suggested by Andriushchenko and Flammarion (2020). (See the configuration sketch after the table.)
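
The Research Type row quotes the paper's central observation that the culprit behind catastrophic overfitting is the fixed magnitude of the single-step perturbation rather than its direction. Below is a minimal sketch, not the authors' code, of the two attacks that comparison rests on: single-step FGSM always takes the full-magnitude step ϵ along the sign of the gradient, while multi-step PGD takes smaller steps and re-projects into the ϵ-ball. `model` and all tensors are placeholders.

```python
# Hedged sketch, not the authors' implementation.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """Single-step attack: one full-magnitude step (eps) along the gradient sign.
    The fixed step size is the quantity the paper identifies as problematic."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return torch.clamp(x.detach() + eps * grad.sign(), 0, 1)

def pgd_attack(model, x, y, eps, alpha, steps):
    """Multi-step PGD within the same eps-ball, used to measure robust accuracy."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()
```

Catastrophic overfitting shows up as a sudden collapse of PGD robust accuracy while FGSM accuracy stays high; tracking both after every epoch reproduces the kind of recording the quoted passage describes.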
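The Pseudocode row only names Algorithm 1, "Stable single-step adversarial training", without reproducing it. The sketch below is an assumption-laden illustration of a single-step procedure that acts on the paper's conclusion that the fixed magnitude, not the direction, is the problem: probe a few checkpoints along the sign-gradient direction and keep the smallest per-example magnitude that already flips the prediction, falling back to the full ϵ otherwise. The checkpoint count `c` and every name here are hypothetical; for the authors' exact Algorithm 1, consult the paper or the repository.

```python
# Assumption-based sketch; not the paper's Algorithm 1 verbatim.
import torch
import torch.nn.functional as F

def stable_single_step(model, x, y, eps, c=3):
    """Generate a single-step adversarial example whose magnitude is shrunk
    per example to the smallest of c checkpoints that is already adversarial."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    direction = torch.autograd.grad(loss, x)[0].sign()
    x = x.detach()

    # scale / c is the fraction of eps applied; start from the full step.
    scale = torch.full((x.size(0), 1, 1, 1), float(c), device=x.device)
    with torch.no_grad():
        for k in range(1, c + 1):
            x_k = torch.clamp(x + (k / c) * eps * direction, 0, 1)
            wrong = (model(x_k).argmax(1) != y).view(-1, 1, 1, 1)
            # Remember the first (smallest) checkpoint that misclassifies.
            scale = torch.where(wrong & (scale == c),
                                torch.full_like(scale, float(k)), scale)
    return torch.clamp(x + (scale / c) * eps * direction, 0, 1)
```

The intent of such a scheme is that examples which cross the decision boundary early receive a shorter step, which is one way to avoid the boundary distortion the paper links to catastrophic overfitting.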
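The Open Source Code row states that the released implementation uses PyTorch with Torchattacks. Below is a hedged usage sketch of a robust-accuracy evaluation with that library; `model`, `loader`, the step size, and the step count are placeholders rather than values taken from the repository.

```python
import torch
import torchattacks  # pip install torchattacks

def pgd_robust_accuracy(model, loader, device="cuda"):
    """Robust accuracy under Torchattacks' PGD; eps mirrors the paper's 8/255,
    while alpha and the number of steps are illustrative choices."""
    attack = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv = attack(images, labels)  # returns adversarial images
        with torch.no_grad():
            correct += (model(adv).argmax(1) == labels).sum().item()
        total += labels.size(0)
    return correct / total
```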
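The Open Datasets row cites CIFAR-10 and Tiny ImageNet. A minimal data-loading sketch for the standard CIFAR-10 splits via torchvision follows; the augmentations and batch size are common defaults, not values confirmed by the paper, and Tiny ImageNet, which is distributed separately, is omitted.

```python
import torch
import torchvision
import torchvision.transforms as T

# CIFAR-10 as shipped by torchvision: 50,000 training and 10,000 test images.
train_tf = T.Compose([T.RandomCrop(32, padding=4),
                      T.RandomHorizontalFlip(),
                      T.ToTensor()])
test_tf = T.ToTensor()

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=test_tf)

# Batch size is an assumed value, not taken from the paper.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)
```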
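The Experiment Setup row lists the optimizer, learning-rate schedule, and attack hyperparameters explicitly, so they translate directly into a configuration sketch. torchvision's resnet18 stands in for the paper's PreActResNet-18, and the training-loop body is elided.

```python
import torch
import torchvision

# Stand-in architecture; the paper uses PreActResNet-18.
model = torchvision.models.resnet18(num_classes=10)

# SGD with lr 0.01, momentum 0.9, and weight decay 5e-4, as quoted above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
# Learning rate decays by a factor of 0.2 at epochs 60, 120, and 160.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2)

EPOCHS = 200              # long enough to check for catastrophic overfitting
EPS = 8 / 255             # maximum L-infinity perturbation
ALPHA_FAST = 1.25 * EPS   # step size for fast adversarial training
BETA_GRADALIGN = 0.2      # GradAlign gradient-alignment regularization weight

for epoch in range(EPOCHS):
    # ... one epoch of (single-step) adversarial training would run here ...
    scheduler.step()
```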