Random Sharpness-Aware Minimization
Authors: Yong Liu, Siqi Mai, Minhao Cheng, Xiangning Chen, Cho-Jui Hsieh, Yang You
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Further, we evaluate our proposed R-SAM on CIFAR and ImageNet datasets. The experimental results illustrate that R-SAM can consistently improve the performance on ResNet and Vision Transformer (ViT) training. |
| Researcher Affiliation | Collaboration | Yong Liu1, Siqi Mai2, Minhao Cheng3, Xiangning Chen4, Cho-Jui Hsieh4, Yang You1 1Department of Computer Science, National University of Singapore 2HPC-AI Technology Inc. 3Department of Computer Science and Engineering, HKUST 4Department of Computer Science, University of California, Los Angeles |
| Pseudocode | Yes | Algorithm 1 Random-SAM (R-SAM) (a hedged sketch of the update follows the table) |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | To validate the performance of proposed method R-SAM, we firstly try to conduct the experiments on widely used CIFAR-10, CIFAR-100 [33] and ImageNet-1k [11] datasets. |
| Dataset Splits | Yes | Following the experimental setting in vanilla SAM [18], we use basic augmentation to preprocess the input image data and the base optimizer for SAM and R-SAM is SGD with Momentum (SGD+M). |
| Hardware Specification | Yes | The experiments in this paper are implemented with JAX [6] on Google TPU-V3 chips. |
| Software Dependencies | No | The paper mentions 'implemented with JAX [6]' but does not provide specific version numbers for JAX or any other software dependencies. |
| Experiment Setup | Yes | For ResNet training, we use SGD with Momentum as the base optimizer of SAM and R-SAM. For ViT training, the base optimizer of SAM and R-SAM is AdamW [38]. We select batch size as 128. (A hedged optimizer-setup sketch also follows the table.) |
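The Pseudocode row above points to Algorithm 1 (Random-SAM). As a rough illustration only, the sketch below shows a SAM-style two-step update with an added random Gaussian weight perturbation before the ascent step, which is what the method's name and the quoted abstract suggest. The function names, the noise scale `sigma`, and the plain-SGD descent step are assumptions, not the authors' released implementation; it is written with JAX, which the paper states was used for the experiments.

```python
# Minimal sketch of a SAM-style update with a random weight perturbation.
# Hypothetical names; assumes a scalar loss_fn(params, batch) and a pytree
# of float parameters. Not the authors' released code.
import jax
import jax.numpy as jnp

def rsam_update(params, batch, key, loss_fn, lr=0.1, rho=0.05, sigma=0.01):
    # 1) Random smoothing (assumed step): add Gaussian noise to the weights.
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    noisy_leaves = [w + sigma * jax.random.normal(k, w.shape, w.dtype)
                    for w, k in zip(leaves, keys)]
    noisy_params = jax.tree_util.tree_unflatten(treedef, noisy_leaves)

    # 2) SAM ascent step: move to the (approximate) worst-case point within
    #    an L2 ball of radius rho around the noisy weights.
    grads = jax.grad(loss_fn)(noisy_params, batch)
    grad_norm = jnp.sqrt(sum(jnp.vdot(g, g).real
                             for g in jax.tree_util.tree_leaves(grads)))
    eps = jax.tree_util.tree_map(lambda g: rho * g / (grad_norm + 1e-12), grads)
    perturbed = jax.tree_util.tree_map(lambda w, e: w + e, noisy_params, eps)

    # 3) Descent step: gradient at the perturbed point, applied to the
    #    original weights with plain SGD for simplicity (the paper uses
    #    SGD+Momentum or AdamW as the base optimizer).
    sam_grads = jax.grad(loss_fn)(perturbed, batch)
    return jax.tree_util.tree_map(lambda w, g: w - lr * g, params, sam_grads)
```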
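For the Experiment Setup row, the sketch below shows one way the base optimizers could be configured. Only the optimizer families (SGD+Momentum for ResNet, AdamW for ViT) and the batch size of 128 come from the quoted text; the use of Optax and the momentum, learning-rate, and weight-decay values are placeholders, not values from the paper.

```python
# Hypothetical base-optimizer setup matching the quoted experiment-setup row.
import optax

BATCH_SIZE = 128  # stated in the experiment-setup row

def make_base_optimizer(model_family: str) -> optax.GradientTransformation:
    if model_family == "resnet":
        # SGD with Momentum (SGD+M); 0.9 is a conventional choice, not quoted.
        return optax.sgd(learning_rate=0.1, momentum=0.9)
    elif model_family == "vit":
        # AdamW; learning rate and weight decay here are assumptions.
        return optax.adamw(learning_rate=1e-3, weight_decay=0.05)
    raise ValueError(f"unknown model family: {model_family}")
```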