SGD on Neural Networks Learns Functions of Increasing Complexity

Authors: Dimitris Kalimeris, Gal Kaplun, Preetum Nakkiran, Benjamin Edelman, Tristan Yang, Boaz Barak, Haofeng Zhang

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks.
Researcher Affiliation | Academia | Preetum Nakkiran (Harvard University), Gal Kaplun (Harvard University), Dimitris Kalimeris (Harvard University), Tristan Yang (Harvard University), Benjamin L. Edelman (Harvard University), Fred Zhang (Harvard University), Boaz Barak (Harvard University)
Pseudocode | No | The paper describes the methods and experimental setup in textual form and through figures, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We consider the following binary classification tasks: (i) Binary MNIST: predict whether the image represents a number from 0 to 4 or from 5 to 9. (ii) CIFAR-10 Animals vs Objects: predict whether the image represents an animal or an object. (iii) CIFAR-10 First 5 vs Last 5: predict whether the image is in classes {0, ..., 4} or {5, ..., 9}. (A construction sketch for these tasks appears after this table.)
Dataset Splits | No | The paper uses well-known datasets like MNIST and CIFAR-10 and refers to 'train error' and 'test error', but it does not explicitly state the percentages or sample counts for the training, validation, and test splits used in its experiments, nor does it mention a dedicated validation set.
Hardware Specification | No | The paper describes the neural network architectures used (CNNs, MLPs) but does not specify any hardware details such as GPU models, CPU types, or cloud resources used for running the experiments.
Software Dependencies | No | The paper mentions using 'vanilla SGD' and 'binary cross-entropy loss' but does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | No | The paper states it uses 'standard uniform Xavier initialization', 'binary cross-entropy loss', 'vanilla SGD without regularization', and 'a relatively small step-size for SGD'. However, it does not provide specific numerical values for hyperparameters such as the exact learning rate, batch size, or number of epochs in the provided text. (A training-setup sketch follows this table.)
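As a point of reference for the Open Datasets row, the sketch below shows one way the three binary tasks could be derived from the standard MNIST and CIFAR-10 releases using torchvision. The use of torchvision, the relabeling code, and the animal/object class grouping are this report's assumptions; the paper itself only describes the tasks in prose.

```python
# Hypothetical sketch: deriving the three binary tasks described above from
# the standard MNIST and CIFAR-10 datasets via torchvision. The paper does not
# specify its data pipeline; the groupings below reflect the task descriptions,
# not the authors' code.
import torch
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# (i) Binary MNIST: digits 0-4 -> class 0, digits 5-9 -> class 1.
mnist = datasets.MNIST(root="data", train=True, download=True, transform=to_tensor)
mnist.targets = (mnist.targets >= 5).long()

# (ii) CIFAR-10 Animals vs Objects: bird, cat, deer, dog, frog, horse
# (class indices 2-7) as animals; airplane, automobile, ship, truck as objects.
cifar_animals = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
animal_classes = {2, 3, 4, 5, 6, 7}
cifar_animals.targets = [int(t in animal_classes) for t in cifar_animals.targets]

# (iii) CIFAR-10 First 5 vs Last 5: classes {0,...,4} -> 0, {5,...,9} -> 1.
cifar_halves = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
cifar_halves.targets = [int(t >= 5) for t in cifar_halves.targets]
```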
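The Experiment Setup row names the ingredients of training (uniform Xavier initialization, binary cross-entropy loss, vanilla SGD without regularization, a relatively small step size) but not their numerical values. The sketch below is a minimal, hypothetical PyTorch rendering of that setup; the architecture, learning rate, and helper names are placeholders rather than the authors' configuration.

```python
# Hypothetical sketch of the setup described above: Xavier uniform
# initialization, binary cross-entropy loss, and vanilla SGD with no momentum
# or weight decay. The learning rate, batch handling, and model below are
# placeholders; the paper does not report exact hyperparameter values.
import torch
from torch import nn

def xavier_init(module):
    # Standard uniform Xavier initialization for linear/conv layers.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)

model = nn.Sequential(           # stand-in MLP; the paper also uses CNNs
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, 1),           # single logit for binary classification
)
model.apply(xavier_init)

criterion = nn.BCEWithLogitsLoss()                         # binary cross-entropy
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)   # vanilla SGD, no regularization

def train_step(x, y):
    # One plain SGD update on a mini-batch (x, y) with 0/1 labels.
    optimizer.zero_grad()
    loss = criterion(model(x).squeeze(1), y.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```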