Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency

Authors: Runqi Lin, Chaojian Yu, Bo Han, Hang Su, Tongliang Liu

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our proposed method, Layer-Aware Adversarial Weight Perturbation (LAP), can effectively prevent CO and further enhance robustness.
Researcher Affiliation | Academia | 1. Sydney AI Centre, School of Computer Science, The University of Sydney, Sydney, Australia; 2. Department of Computer Science, Hong Kong Baptist University, Hong Kong, China; 3. Department of Computer Science and Technology, Institute for AI, BNRist Center, Tsinghua University, Beijing, China.
Pseudocode | Yes | The LAP algorithm is summarized in Algorithm 1.
Open Source Code | Yes | Our implementation can be found at https://github.com/tmllab/2024_ICML_LAP.
Open Datasets | Yes | We use three benchmark datasets, CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and Tiny-ImageNet (Netzer et al., 2011), for evaluating the performances of our proposed method.
Dataset Splits | No | The paper mentions using datasets such as CIFAR-10, CIFAR-100, and Tiny-ImageNet, and refers to 'test accuracy' in tables and figures. However, it does not explicitly provide train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits) in the text.
Hardware Specification | Yes | The results are obtained on a single NVIDIA RTX 4090 GPU and averaged over 30 training epochs.
Software Dependencies | No | The paper mentions using the SGD optimizer and adhering to the configurations of official repositories for baselines, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | We use the cyclical learning rate schedule (Smith, 2017) spanning 30 epochs, which reaches its maximum learning rate of 0.2 at the 15th epoch... We employ the SGD optimizer with a momentum of 0.9, a weight decay of 5×10⁻⁴, the L∞-norm for input perturbation, and the L2-norm for weight perturbation. ...we set the γ as 0.3, and the detailed setting for β can be found in Table 3.
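
The quoted experiment setup maps onto a standard training configuration. Below is a minimal sketch, assuming PyTorch, a CIFAR-scale ResNet as a stand-in model, and a hypothetical steps_per_epoch value; it covers only the optimizer and cyclical learning-rate schedule quoted above, not the authors' full LAP training loop (their Algorithm 1).

```python
# Minimal sketch of the quoted training configuration (not the authors'
# implementation): SGD with momentum 0.9, weight decay 5e-4, and a cyclical
# learning rate peaking at 0.2 at epoch 15 of a 30-epoch run (Smith, 2017).
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)  # stand-in CIFAR-10 model

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.2,                 # placeholder; the scheduler drives the actual rate
    momentum=0.9,
    weight_decay=5e-4,
)

steps_per_epoch = 391       # hypothetical: CIFAR-10 with batch size 128
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer,
    base_lr=0.0,
    max_lr=0.2,
    step_size_up=15 * steps_per_epoch,    # ramp up to 0.2 by epoch 15
    step_size_down=15 * steps_per_epoch,  # decay back toward 0 by epoch 30
    cycle_momentum=False,   # keep momentum fixed at 0.9 as quoted
)

# scheduler.step() would be called after each batch inside the training loop.
# The L∞-bounded input perturbation, the L2-bounded layer-aware weight
# perturbation (controlled by γ and β), and the rest of LAP are specified in
# the paper's Algorithm 1 and are not sketched here.
```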