Random Sharpness-Aware Minimization

Authors: Yong Liu, Siqi Mai, Minhao Cheng, Xiangning Chen, Cho-Jui Hsieh, Yang You

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Further, we evaluate our proposed R-SAM on CIFAR and ImageNet datasets. The experimental results illustrate that R-SAM can consistently improve the performance on ResNet and Vision Transformer (ViT) training.
Researcher Affiliation | Collaboration | Yong Liu (1), Siqi Mai (2), Minhao Cheng (3), Xiangning Chen (4), Cho-Jui Hsieh (4), Yang You (1). (1) Department of Computer Science, National University of Singapore; (2) HPC-AI Technology Inc.; (3) Department of Computer Science and Engineering, HKUST; (4) Department of Computer Science, University of California, Los Angeles
Pseudocode | Yes | Algorithm 1 Random-SAM (R-SAM)
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | Yes | To validate the performance of proposed method R-SAM, we firstly try to conduct the experiments on widely used CIFAR-10, CIFAR-100 [33] and ImageNet-1k [11] datasets.
Dataset Splits | Yes | Following the experimental setting in vanilla SAM [18], we use basic augmentation to preprocess the input image data and the base optimizer for SAM and R-SAM is SGD with Momentum (SGD+M).
Hardware Specification | Yes | The experiments in this paper are implemented with JAX [6] on Google TPU-V3 chips.
Software Dependencies | No | The paper mentions 'implemented with JAX [6]' but does not provide specific version numbers for JAX or any other software dependencies.
Experiment Setup | Yes | For ResNet training, we use SGD with Momentum as the base optimizer of SAM and R-SAM. For ViT training, the base optimizer of SAM and R-SAM is AdamW [38]. We select batch size as 128.
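
The Experiment Setup row can be read as a small configuration table. The sketch below is illustrative only and is not taken from the paper's code. The names `TrainSetup`, `BASE_OPTIMIZER`, and `make_setup` are hypothetical, and it assumes the quoted batch size of 128 applies to both model families, which the row does not state explicitly. Learning rates, schedules, and the SAM/R-SAM perturbation settings are not quoted on this page, so they are left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    """Hypothetical container for the settings quoted in the Experiment Setup row."""
    model_family: str
    base_optimizer: str
    batch_size: int


# Only values quoted on this page are filled in.
# Assumption: the batch size of 128 is shared by both model families.
BATCH_SIZE = 128
BASE_OPTIMIZER = {
    "resnet": "sgd_momentum",  # "SGD with Momentum as the base optimizer"
    "vit": "adamw",            # "the base optimizer ... is AdamW"
}


def make_setup(model_family: str) -> TrainSetup:
    """Return the quoted base optimizer and batch size for a model family."""
    key = model_family.lower()
    if key not in BASE_OPTIMIZER:
        raise ValueError(f"no setup quoted for model family: {model_family!r}")
    return TrainSetup(key, BASE_OPTIMIZER[key], BATCH_SIZE)
```

For example, `make_setup("ViT")` returns a setup whose `base_optimizer` is `"adamw"`, while an unlisted family raises `ValueError` instead of silently falling back to a default.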