On the Duality Between Sharpness-Aware Minimization and Adversarial Training

Authors: Yihao Zhang, Hangzhou He, Jingyu Zhu, Huanran Chen, Yifei Wang, Zeming Wei

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct extensive experiments to show the effectiveness of SAM in improving robustness while maintaining natural performance, across multiple tasks, data modalities, and various settings.
Researcher Affiliation | Academia | Peking University; University of California, Berkeley; Beijing Institute of Technology; MIT CSAIL.
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found.
Open Source Code | Yes | Code is available at https://github.com/weizeming/SAM_AT.
Open Datasets | Yes | We examine the robustness of SAM on CIFAR-{10,100} (Krizhevsky et al., 2009) and Tiny ImageNet (Chrabaszcz et al., 2017) datasets.
Dataset Splits | No | The paper evaluates on CIFAR-{10,100} and Tiny ImageNet, which have standard splits, but it does not state the train/validation/test split percentages or sample counts used in its experiments, nor does it explicitly cite the use of standard validation splits.
Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, or cloud instance specifications) used to run the experiments is mentioned in the paper.
Software Dependencies | No | The paper mentions the torchattacks framework (Kim, 2020) and Hugging Face's transformers library (Sanh et al., 2019), along with the AdamW, SGD, and Adam optimizers, but it provides no version numbers for these software dependencies.
Experiment Setup | Yes | We set the weight decay as 5e-4 and momentum as 0.9 and train 100 epochs with the learning rate initialized as 0.1 for SGD and 1e-3 for Adam, which is divided by 10 at the 75th and 90th epochs, respectively.
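
The Experiment Setup row above quotes the paper's training recipe (SGD, weight decay 5e-4, momentum 0.9, 100 epochs, initial learning rate 0.1 divided by 10 at epochs 75 and 90). The following is a minimal PyTorch sketch of that recipe: the optimizer and schedule values come from the quote, while the tiny model and random CIFAR-shaped tensors are placeholders for the real architectures and datasets used in the paper.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; the paper trains real networks on CIFAR-{10,100} and Tiny ImageNet.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))),
    batch_size=128, shuffle=True,
)
criterion = nn.CrossEntropyLoss()

# SGD with lr 0.1, momentum 0.9, and weight decay 5e-4, as quoted from the paper.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# Learning rate divided by 10 at the 75th and 90th of 100 epochs.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 90], gamma=0.1)

for epoch in range(100):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```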
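The Software Dependencies row notes that robustness is evaluated with the torchattacks framework, without pinned versions. Below is a hedged sketch of how such an evaluation is commonly run with torchattacks; the choice of PGD, its eps/alpha/steps values, and the toy model and batch are illustrative assumptions, not settings reported in the paper.

```python
import torch
from torch import nn
import torchattacks  # version not pinned in the paper

# Hypothetical model and batch for illustration; inputs are assumed to lie in [0, 1].
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# PGD attack with a common L_inf budget (assumed, not taken from the paper).
attack = torchattacks.PGD(model, eps=8 / 255, alpha=2 / 255, steps=10)
adv_images = attack(images, labels)

# Robust accuracy on this batch: fraction of adversarial examples still classified correctly.
with torch.no_grad():
    robust_acc = (model(adv_images).argmax(dim=1) == labels).float().mean().item()
print(f"robust accuracy: {robust_acc:.3f}")
```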