Generalization Analysis of Stochastic Weight Averaging with General Sampling

Authors: Peng Wang, Li Shen, Zerui Tao, Shuaida He, Dacheng Tao

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This paper mainly focuses on theoretical exploration, and some experimental results are intended to verify its correctness.
Researcher Affiliation | Collaboration | 1 Huazhong University of Science and Technology, China; 2 Nanyang Technological University, Singapore; 3 Sun Yat-sen University, China; 4 JD Explore Academy, China; 5 Tokyo University of Agriculture and Technology, Japan; 6 RIKEN AIP, Japan; 7 Southern University of Science and Technology, China.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | "We train a LeNet (LeCun et al., 1998) with two convolutional layers on the MNIST dataset (Deng, 2012), and a VGG16 (Simonyan & Zisserman, 2014) on the CIFAR10 dataset (Krizhevsky et al., 2009). ...a multi-layer perceptron (MLP) on the Adult dataset in the UCI repository (Becker & Kohavi, 1996)."
Dataset Splits | No | The paper describes how two different datasets were constructed for stability analysis but does not specify train/validation/test splits in percentages or counts, nor does it explicitly mention a dedicated validation set.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers.
Experiment Setup | Yes | "The batch size is set as 128. The learning rate is chosen from {0.1, 0.05, 0.03}. ...we set the mini-batch size as 1 and learning rate as 0.03 for MNIST dataset and 0.01 for Adult dataset, respectively."
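For context on the method under analysis, the core of Stochastic Weight Averaging is simply a running mean of the optimizer's iterates. The sketch below is illustrative only: the toy quadratic loss, the step size, and the noise level are assumptions for demonstration, not the paper's actual experimental setup (which, per the table above, uses LeNet/VGG16/MLP models on MNIST, CIFAR10, and Adult).

```python
import numpy as np

# Minimal sketch of the SWA update rule: average the SGD iterates with a
# running mean. The quadratic toy loss and hyperparameters are illustrative
# assumptions, not the paper's experiments.
rng = np.random.default_rng(0)

def sgd_step(w, lr=0.03, noise=0.1):
    """One noisy gradient step on f(w) = 0.5 * ||w||^2 (true gradient: w)."""
    grad = w + noise * rng.standard_normal(w.shape)
    return w - lr * grad

w = rng.standard_normal(5)   # initial weights
swa_w = w.copy()             # SWA running average
iterates = [w.copy()]        # kept only to check the average below

for t in range(1, 1001):
    w = sgd_step(w)
    iterates.append(w.copy())
    # Running-mean form of SWA: after this line, swa_w equals the mean of
    # the t + 1 iterates seen so far.
    swa_w += (w - swa_w) / (t + 1)

print("||swa_w|| =", np.linalg.norm(swa_w), " ||w_last|| =", np.linalg.norm(w))
```

The incremental form `swa_w += (w - swa_w) / (t + 1)` avoids storing all checkpoints; it is algebraically identical to averaging every iterate seen so far.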