Accelerating SGD with momentum for over-parameterized learning

Authors: Chaoyue Liu, Mikhail Belkin

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental evaluation of MaSS for several standard architectures of deep networks, including ResNet and convolutional networks, shows improved performance over SGD, SGD+Nesterov and Adam.
Researcher Affiliation | Academia | Chaoyue Liu, Department of Computer Science, The Ohio State University, Columbus, OH 43210, liu.2656@osu.edu; Mikhail Belkin, Department of Computer Science, The Ohio State University, Columbus, OH 43210, mbelkin@cse.ohio-state.edu
Pseudocode | Yes | Appendix A provides pseudocode for MaSS (Algorithm 1: MaSS, Momentum-added Stochastic Solver); a hedged sketch of the update appears after this table.
Open Source Code | Yes | Code URL: https://github.com/ts66395/MaSS
Open Datasets | Yes | Real data: MNIST and CIFAR-10. We compare the optimization performance of SGD, SGD+Nesterov and MaSS on the following tasks: classification of MNIST with a fully-connected network (FCN), classification of CIFAR-10 with a convolutional neural network (CNN), and Gaussian kernel regression on MNIST.
Dataset Splits | No | The paper mentions training and testing but does not specify a validation split or how one was used.
Hardware Specification | No | The paper only acknowledges GPUs donated by Nvidia; no specific GPU models or counts are given.
Software Dependencies | No | The paper does not specify versions for any software components used in the experiments (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | All algorithms are run with mini-batches of size 64 for neural-network training. In each task, the same initial learning rate is used for MaSS, SGD and SGD+Nesterov, and the same number of epochs is run (150 for the CNN, 300 for ResNet-32). CNN: η = 0.01 (initial), α = 0.05, κm = 3; and η = 0.3 (initial), α = 0.05, κm = 6. ResNet-32: η = 0.1 (initial), α = 0.05, κm = 2; and η = 0.3 (initial), α = 0.05, κm = 24.
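
The Pseudocode row above refers to Algorithm 1 in the paper's appendix. As a rough illustration only, below is a minimal sketch of a MaSS-style step, assuming the commonly described three-variable form: a descent step taken from an auxiliary point, followed by a Nesterov-style extrapolation plus a compensation term proportional to the same stochastic gradient. The step sizes lr1, lr2 and momentum gamma are generic stand-ins; the paper parameterizes them through η, α and κm in Algorithm 1, and that mapping (and the exact sign conventions) should be taken from the paper, not from this sketch. The toy least-squares objective and all function names below are illustrative.

```python
import numpy as np

def mass_step(w, u, grad_u, lr1, lr2, gamma):
    """One MaSS-style update (sketch; see Algorithm 1 in the paper for the authoritative version).

    w, u   : current weight and auxiliary iterates
    grad_u : stochastic gradient evaluated at the auxiliary point u
    lr1    : primary step size
    lr2    : secondary (compensation) step size
    gamma  : momentum coefficient
    """
    w_next = u - lr1 * grad_u                              # descent step from the auxiliary point
    u_next = w_next + gamma * (w_next - w) + lr2 * grad_u  # extrapolation plus compensation term
    return w_next, u_next

# Toy usage on a least-squares objective f(w) = 0.5 * ||A w - b||^2 (illustrative only).
rng = np.random.default_rng(0)
A, b = rng.normal(size=(64, 8)), rng.normal(size=64)
w = u = np.zeros(8)
for _ in range(200):
    idx = rng.choice(64, size=16, replace=False)        # mini-batch of rows
    grad = A[idx].T @ (A[idx] @ u - b[idx]) / len(idx)  # stochastic gradient at u
    w, u = mass_step(w, u, grad, lr1=0.05, lr2=0.005, gamma=0.9)
```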
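
To make the reported setup easier to scan, the sketch below collects the hyperparameters quoted in the Experiment Setup row into a single Python configuration object for a hypothetical training script. Only the numeric values come from the row above; the key names, the grouping, and the assumption that the (η, α, κm) triples are MaSS settings are illustrative.

```python
# Hyperparameters quoted in the Experiment Setup row, gathered into one place.
# Structure and key names are illustrative; only the numeric values are from the report.
EXPERIMENT_SETUP = {
    "batch_size": 64,            # mini-batch size for all neural-network training
    "cnn_cifar10": {
        "epochs": 150,
        "mass_settings": [       # two reported (lr, alpha, kappa_m) combinations
            {"lr": 0.01, "alpha": 0.05, "kappa_m": 3},
            {"lr": 0.30, "alpha": 0.05, "kappa_m": 6},
        ],
    },
    "resnet32": {
        "epochs": 300,
        "mass_settings": [
            {"lr": 0.10, "alpha": 0.05, "kappa_m": 2},
            {"lr": 0.30, "alpha": 0.05, "kappa_m": 24},
        ],
    },
}
```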