On Linear Stability of SGD and Input-Smoothness of Neural Networks
Authors: Chao Ma, Lexing Ying
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1 shows the results for a fully-connected network trained on Fashion MNIST dataset and a VGG-11 network trained on CIFAR10 dataset. |
| Researcher Affiliation | Academia | Chao Ma, Department of Mathematics, Stanford University, Stanford, CA 94305, chaoma@stanford.edu; Lexing Ying, Department of Mathematics, Stanford University, Stanford, CA 94305, lexing@stanford.edu |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Section F and the URL therein. |
| Open Datasets | Yes | Figure 1 shows the results for a fully-connected network trained on Fashion MNIST dataset and a VGG-11 network trained on CIFAR10 dataset. |
| Dataset Splits | No | The paper mentions training and testing sets, but does not provide specific training/validation/test dataset splits (e.g., percentages or counts) or reference predefined splits with citations for reproducibility. |
| Hardware Specification | Yes | All experiments were run on a single NVIDIA 2080 Ti GPU. |
| Software Dependencies | Yes | All models are implemented in PyTorch 1.9.1 with CUDA 11.1, and trained with Python 3.8. |
| Experiment Setup | Yes | For Fashion MNIST, the fully connected network has 2 hidden layers with 1024 neurons each, and uses ReLU activation. It is trained for 50 epochs with a batch size of 100, and initial learning rate 0.1, which is decayed by 0.1 at epoch 20 and 40. For CIFAR10, we use VGG-11 [29] (with batch normalization) without data augmentation. It is trained for 150 epochs with a batch size of 128, and initial learning rate 0.1, which is decayed by 0.1 at epoch 75 and 125. The SGD optimizer is used for all experiments with momentum 0.9 and weight decay 5e-4. |
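
The experiment-setup row translates directly into a training configuration. Below is a minimal PyTorch sketch of the Fashion MNIST setup quoted above (a fully connected network with two 1024-unit ReLU hidden layers, SGD with momentum 0.9 and weight decay 5e-4, batch size 100, learning rate 0.1 decayed by 0.1 at epochs 20 and 40, 50 epochs total). It is an illustration assembled from the quoted text, not the authors' released code; the data path, `ToTensor` transform, and cross-entropy loss are assumptions.

```python
# Minimal sketch of the Fashion MNIST setup described in the paper's text
# (not the authors' released code). Only the hyperparameters come from the
# quoted setup; data path, transform, and loss are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Fully connected network: 2 hidden layers with 1024 neurons each, ReLU activation.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 10),
).to(device)

# SGD with momentum 0.9 and weight decay 5e-4; initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# Decay the learning rate by a factor of 0.1 at epochs 20 and 40 (of 50 total).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[20, 40], gamma=0.1)
criterion = nn.CrossEntropyLoss()

train_set = datasets.FashionMNIST(root="./data", train=True, download=True,
                                  transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=100, shuffle=True)

for epoch in range(50):
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The CIFAR10 configuration quoted above differs only in the model and schedule: VGG-11 with batch normalization, no data augmentation, 150 epochs, batch size 128, and the same 0.1 learning rate decayed by 0.1 at epochs 75 and 125 (e.g., `milestones=[75, 125]` in the scheduler above). The paper's VGG-11 may be a CIFAR-adapted variant rather than torchvision's ImageNet-sized `vgg11_bn`, so that model definition is not reproduced here.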