Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Authors: Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on CIFAR10, CIFAR100, and ImageNet-1K confirm the superior efficiency of our method to SAM, and the performance is preserved or even better with a perturbation of merely 50% sparsity. |
| Researcher Affiliation | Collaboration | (1) Media Analytics and Computing Laboratory, Department of Artificial Intelligence, School of Informatics, Xiamen University, China; (2) JD Explore Academy, Beijing, China; (3) The University of Sydney, Australia |
| Pseudocode | Yes | Algorithm 1 Sparse SAM (SSAM) and Algorithm 2 Sparse Mask Generation |
| Open Source Code | Yes | Code is available at https://github.com/Mi-Peng/Sparse-Sharpness-Aware-Minimization. |
| Open Datasets | Yes | Datasets. We use CIFAR10/CIFAR100 [33] and ImageNet-1K [8] as the benchmarks of our method. |
| Dataset Splits | Yes | CIFAR10 and CIFAR100 have 50,000 images of 32×32 resolution for training, while 10,000 images for test. ImageNet-1K [8] is the most widely used benchmark for image classification, which has 1,281,167 images of 1000 classes and 50,000 images for validation. |
| Hardware Specification | No | The paper does not specify the exact hardware components (e.g., specific GPU models, CPU types, or memory amounts) used for running the experiments. The self-reported checklist also states '[No]' for 'type of resources used'. |
| Software Dependencies | No | The paper mentions software frameworks generally but does not provide specific version numbers for any key software components or libraries (e.g., PyTorch 1.x, Python 3.x). |
| Experiment Setup | Yes | The models on CIFAR10/CIFAR100 are trained with 128 batch size for 200 epochs. We apply the random crop, random horizontal flip, normalization and cutout [11] for data augmentation, and the initial learning rate is 0.05 with a cosine learning rate schedule. The momentum and weight decay of SGD are set to 0.9 and 5e-4, respectively. SAM and SSAM apply the same settings, except that weight decay is set to 0.001 [14]. We determine the perturbation magnitude ρ from {0.01, 0.02, 0.05, 0.1, 0.2, 0.5} via grid search. In CIFAR10 and CIFAR100, we set ρ as 0.1 and 0.2, respectively. For ImageNet-1K, ...train ResNet with a batch size of 256, and adopt the cosine learning rate schedule with initial learning rate 0.1. The momentum and weight decay of SGD is set as 0.9 and 1e-4. SAM and SSAM use the same settings as above. The perturbation magnitude ρ is set to 0.07. |
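
For readers who want to reuse the settings quoted in the Experiment Setup row, the sketch below collects them into plain Python dictionaries and adds a cosine learning rate helper. The dictionary layout, the key names, and the `cosine_lr` function are hypothetical and are not taken from the authors' repository. The exact form of the schedule (per-epoch annealing down to zero) is also an assumption, since the quoted text only says "cosine learning rate schedule". Values the quoted text elides, such as the number of ImageNet-1K epochs, are deliberately left out.

```python
import math

# Hyperparameters as quoted in the "Experiment Setup" row.
# Key names and layout are hypothetical, chosen for illustration only.
CIFAR_SETUP = {
    "batch_size": 128,
    "epochs": 200,
    "initial_lr": 0.05,
    "momentum": 0.9,
    # SGD uses 5e-4; SAM and SSAM use 0.001 per the quoted text.
    "weight_decay": {"SGD": 5e-4, "SAM": 1e-3, "SSAM": 1e-3},
    "rho_grid": [0.01, 0.02, 0.05, 0.1, 0.2, 0.5],
    "rho": {"CIFAR10": 0.1, "CIFAR100": 0.2},
}

IMAGENET_SETUP = {
    "batch_size": 256,
    "initial_lr": 0.1,
    "momentum": 0.9,
    "weight_decay": 1e-4,  # shared by SGD, SAM, and SSAM per the quoted text
    "rho": 0.07,
    # The epoch count is elided in the quoted text, so it is not set here.
}


def cosine_lr(epoch, total_epochs, initial_lr, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch.

    Assumption: the rate is updated once per epoch and decays to min_lr
    (zero by default); the source does not state either detail.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the assumed CIFAR schedule at a few checkpoints.
    for epoch in (0, 50, 100, 150, 200):
        lr = cosine_lr(epoch, CIFAR_SETUP["epochs"], CIFAR_SETUP["initial_lr"])
        print(f"epoch {epoch:3d}: lr = {lr:.5f}")
```

Under these assumptions the CIFAR rate starts at 0.05, passes through half that value at the midpoint of training, and reaches zero at epoch 200. The authors' code may step the schedule per iteration or use a nonzero floor instead.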