Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Improving Sharpness-Aware Minimization by Lookahead
Authors: Runsheng Yu, Youzhi Zhang, James Kwok
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard benchmark datasets also verify that the proposed method outperforms the SOTAs, and converge more effectively to flat minima. |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, The Hong Kong University of Science and Technology; (2) Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science & Innovation, CAS. |
| Pseudocode | Yes | Algorithm 1: Lookahead SAM and Optimistic Lookahead-SAM. Algorithm 2: Adaptive Optimistic SAM (AO-SAM). |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of its proposed methodology. |
| Open Datasets | Yes | we use the popular image classification datasets CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). ... we perform experiments on the ImageNet dataset using ResNet-50 (He et al., 2016)... we perform NLP paraphrase identification using the pre-trained Bert-Large (Devlin et al., 2018) on the Microsoft Research Paraphrase Corpus (MRPC) dataset (Dolan & Brockett, 2005). |
| Dataset Splits | Yes | 10% of the training set is used for validation. |
| Hardware Specification | No | The paper mentions 'GPU memory' but does not specify any particular GPU model, CPU, or other hardware specifications used for experiments. |
| Software Dependencies | No | The paper mentions 'SGD optimizer' and 'cosine learning rate schedule (Loshchilov & Hutter, 2017)' but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Following the setup in (Jiang et al., 2023; Foret et al., 2021), we use batch size 128, initial learning rate 0.1, cosine learning rate schedule (Loshchilov & Hutter, 2017), and SGD optimizer. Learning rate η t is always set to ηt. The number of training epochs is 200. For the proposed methods, we select ρ ∈ {0.01, 0.05, 0.08, 0.1, 0.5, 0.8, 1, 1.5, 1.8, 2} by using CIFAR-10's validation set on ResNet-18. The selected ρ is then directly used on CIFAR-100 and the other backbones. For the c_t schedule in (6), since different SAM variants yield different %SAM's, we vary the hyper-parameters (κ1, κ2) so that the %SAM obtained by AO-SAM matches their %SAM values. Hyper-parameters for the other baselines are the same as their original papers. |
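
The quoted setup can be made concrete with a small sketch. It is not the authors' code, since the table above records that none was released. The constants are taken from the "Dataset Splits" and "Experiment Setup" rows. Everything else is an assumption made for illustration: the names `cosine_lr` and `split_train_val`, a per-epoch (rather than per-step) schedule, a minimum learning rate of 0, and a seeded random shuffle for the hold-out split.

```python
import math
import random

# Values quoted in the "Dataset Splits" and "Experiment Setup" rows.
BATCH_SIZE = 128
INITIAL_LR = 0.1
EPOCHS = 200
VAL_FRACTION = 0.10
RHO_GRID = [0.01, 0.05, 0.08, 0.1, 0.5, 0.8, 1, 1.5, 1.8, 2]


def cosine_lr(epoch, total_epochs=EPOCHS, base_lr=INITIAL_LR, min_lr=0.0):
    """Cosine annealing without restarts.

    Assumptions: the rate is updated once per epoch and decays to min_lr=0;
    the quoted text states neither detail.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def split_train_val(num_examples, val_fraction=VAL_FRACTION, seed=0):
    """Hold out val_fraction of the training indices for validation.

    Assumption: a seeded random shuffle; the quoted text only gives the 10% figure.
    """
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_examples * val_fraction))
    return indices[num_val:], indices[:num_val]


if __name__ == "__main__":
    # CIFAR-10 and CIFAR-100 each ship with 50,000 training images.
    train_idx, val_idx = split_train_val(50_000)
    print(len(train_idx), len(val_idx))
    for epoch in (0, 50, 100, 150, 200):
        print(epoch, round(cosine_lr(epoch), 4))
```

Under these assumptions, the split leaves 45,000 CIFAR training images for training and 5,000 for validation, and the learning rate falls from 0.1 at epoch 0 to 0.05 at epoch 100 and to 0 at epoch 200. `RHO_GRID` lists the candidate ρ values that the quoted text says were compared on CIFAR-10's validation set with ResNet-18.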