Generalization Analysis of Stochastic Weight Averaging with General Sampling
Authors: Peng Wang, Li Shen, Zerui Tao, Shuaida He, Dacheng Tao
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper mainly focuses on theoretical analysis; the experimental results are included to verify the theory's correctness. |
| Researcher Affiliation | Collaboration | Huazhong University of Science and Technology, China; Nanyang Technological University, Singapore; Sun Yat-sen University, China; JD Explore Academy, China; Tokyo University of Agriculture and Technology, Japan; RIKEN AIP, Japan; Southern University of Science and Technology, China. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We train a LeNet (LeCun et al., 1998) with two convolutional layers on the MNIST dataset (Deng, 2012), and a VGG16 (Simonyan & Zisserman, 2014) on the CIFAR10 dataset (Krizhevsky et al., 2009). ...a multi-layer perceptron (MLP) on the Adult dataset in the UCI repository (Becker & Kohavi, 1996). (The LeNet is sketched after the table.) |
| Dataset Splits | No | The paper describes how two different datasets were constructed for stability analysis but does not specify train/validation/test dataset splits in percentages or counts, nor does it explicitly mention a dedicated validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not list its software dependencies or their version numbers. |
| Experiment Setup | Yes | The batch size is set to 128. The learning rate is chosen from {0.1, 0.05, 0.03}. ...we set the mini-batch size to 1 and the learning rate to 0.03 for the MNIST dataset and 0.01 for the Adult dataset, respectively. (A hedged training sketch follows the table.) |
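
The Open Datasets row quotes a LeNet with two convolutional layers trained on MNIST. A minimal PyTorch sketch of such a model follows; the kernel sizes, channel widths, and fully connected dimensions are assumptions following the classic LeCun et al. (1998) design, since the paper excerpt only fixes the number of convolutional layers.

```python
# Hypothetical two-convolutional-layer LeNet for 28x28 MNIST inputs.
# Layer widths are assumptions; the paper only states the architecture family.
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 1x28x28 -> 6x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),            # -> 16x10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```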
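
The Experiment Setup row pins down the batch size (128) and the learning-rate grid ({0.1, 0.05, 0.03}). The sketch below shows how stochastic weight averaging, the method the paper analyzes, could be run under that setup using PyTorch's `torch.optim.swa_utils`; the epoch count, the point at which averaging starts, and the use of plain SGD are assumptions, not details from the paper.

```python
# Minimal SWA training sketch under the reported setup (batch size 128,
# learning rate from {0.1, 0.05, 0.03}). The averaging schedule and number
# of epochs are assumptions.
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel, update_bn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def train_swa(model: nn.Module, lr: float = 0.03, epochs: int = 10,
              swa_start: int = 5) -> nn.Module:
    train_set = datasets.MNIST("data", train=True, download=True,
                               transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=128, shuffle=True)  # batch size from the paper
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    swa_model = AveragedModel(model)  # maintains a running average of the SGD iterates
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        if epoch >= swa_start:           # start averaging after a warm-up phase (assumed)
            swa_model.update_parameters(model)

    # Recompute BatchNorm statistics for the averaged weights;
    # this is a no-op for BN-free models such as the LeNet above.
    update_bn(loader, swa_model)
    return swa_model

# Usage: swa_model = train_swa(LeNet(), lr=0.03)
```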