SGD Converges to Global Minimum in Deep Learning via Star-convex Path

Authors: Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our argument exploits the following two important properties: 1) the training loss can achieve zero value (approximately), which has been widely observed in deep learning; 2) SGD follows a star-convex path, which is verified by various experiments in this paper." and "our experiments establish strong empirical evidences that SGD (when training the loss to zero value) follows a star-convex path." (see the star-convexity check sketch after the table)
Researcher Affiliation | Collaboration | Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh (Duke University; University of Science and Technology of China; Microsoft Research Asia; The Ohio State University)
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper; the SGD update rule is described in text.
Open Source Code | No | No statement or link providing concrete access to source code for the methodology described in this paper was found.
Open Datasets | Yes | "we train a standard multi-layer perceptron (MLP) network Krizhevsky (2009), a variant of Alexnet and a variant of Inception network Zhang et al. (2017a) on the CIFAR10 dataset Krizhevsky (2009) using SGD under crossentropy loss." and "We train the aforementioned three types of neural networks, i.e., MLP, Alexnet and Inception, on CIFAR10 Krizhevsky (2009) and MNIST Lecun et al. (1998) dataset using SGD."
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for the experiments; it only implies the use of computing resources for training neural networks.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., libraries, frameworks, or solver versions) needed to replicate the experiments.
Experiment Setup | Yes | "In all experiments, we adopt a constant learning rate (0.01 for MLP and Alexnet, 0.1 for Inception) and a constant mini-batch size 128." (see the training-setup sketch below)
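
The quoted setup in the Experiment Setup row (plain SGD, constant learning rate, mini-batch size 128, cross-entropy loss on CIFAR10) translates into a short training loop. The sketch below is a minimal PyTorch reconstruction under assumptions: the paper releases no code, so the MLP architecture, preprocessing, and epoch budget used here are illustrative placeholders rather than the authors' configuration.

```python
# Hedged sketch of the reported setup: plain SGD, constant learning rate
# (0.01 for MLP/Alexnet, 0.1 for the Inception variant), mini-batch size 128,
# cross-entropy loss on CIFAR10. The MLP layer sizes, preprocessing, and
# epoch count are assumptions; the paper does not specify them in the quotes above.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# Placeholder MLP; the paper's exact architecture is not given in the quoted text.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 10),
)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # constant lr; 0.1 for Inception

for epoch in range(100):  # epoch budget is an assumption
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```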
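The star-convex path claim quoted in the Research Type row can also be probed empirically. One natural quantity to monitor, and an assumption about what the authors plot, is the first-order star-convexity inequality along the SGD iterates: for the mini-batch loss l_k at step k and a reference point x_star (an iterate with near-zero training loss), check whether l_k(x_star) >= l_k(x_k) + <grad l_k(x_k), x_star - x_k>. The helper below is an illustrative sketch, not the authors' code; the function names and the residual it returns are assumptions.

```python
# Hedged sketch: empirical check of a star-convexity condition along the SGD path.
import torch

def flatten_params(model):
    """Concatenate all model parameters into a single detached 1-D tensor."""
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

def star_convexity_residual(model, criterion, images, labels, x_star):
    """Return l_k(x_star) - l_k(x_k) - <grad l_k(x_k), x_star - x_k>.

    A non-negative value means the star-convexity inequality holds for this
    mini-batch at the current iterate x_k (the model's current parameters).
    """
    model.zero_grad()
    loss_k = criterion(model(images), labels)
    loss_k.backward()
    grad = torch.cat([p.grad.reshape(-1) for p in model.parameters()])
    x_k = flatten_params(model)

    with torch.no_grad():
        # Temporarily load x_star into the model to evaluate the same mini-batch loss.
        backup = [p.detach().clone() for p in model.parameters()]
        offset = 0
        for p in model.parameters():
            n = p.numel()
            p.copy_(x_star[offset:offset + n].view_as(p))
            offset += n
        loss_star = criterion(model(images), labels)
        # Restore the current iterate x_k.
        for p, b in zip(model.parameters(), backup):
            p.copy_(b)

    return (loss_star - loss_k.detach() - grad.dot(x_star - x_k)).item()
```

In practice, x_star would be recorded as flatten_params(model) once training reaches (approximately) zero loss, and the residual logged at each SGD step; consistently non-negative values would reproduce the paper's empirical observation that SGD follows a star-convex path in that regime.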