A Bayesian Perspective on Generalization and Stochastic Gradient Descent

Authors: Samuel L. Smith and Quoc V. Le

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We verify these predictions empirically." |
| Researcher Affiliation | Industry | Samuel L. Smith & Quoc V. Le, Google Brain, {slsmith, qvl}@google.com |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper neither provides concrete access to source code for the methodology nor states that code will be released. |
| Open Datasets | Yes | "We form a small balanced training set comprising 800 images from MNIST." |
| Dataset Splits | No | The paper mentions "validation set accuracy" in Appendix E but does not describe the creation or size of a validation set for the experiments in the main body (e.g., Sections 3 and 4). |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9"). |
| Experiment Setup | Yes | "We use SGD with a momentum parameter of 0.9. Unless otherwise stated, we use a constant learning rate of 1.0 which does not depend on the batch size or decay during training. Furthermore, we train on just 1000 images, selected at random from the MNIST training set." |
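The Experiment Setup row quotes the only concrete training configuration the paper gives: SGD with momentum 0.9, a constant learning rate of 1.0, and a 1000-image random subset of MNIST. Since no official code is released (see the Open Source Code row), the following is a minimal PyTorch sketch of that setup, not the authors' implementation; the model architecture, batch size, and epoch count are assumptions not stated in the quoted passage.

```python
# Minimal sketch of the quoted experiment setup (not the authors' code).
# From the paper: SGD, momentum 0.9, constant lr 1.0, 1000 random MNIST images.
# Assumptions: model architecture, batch size, number of epochs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

torch.manual_seed(0)

# 1000 images selected at random from the MNIST training set, per the paper.
full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
indices = torch.randperm(len(full_train))[:1000].tolist()
train_loader = DataLoader(Subset(full_train, indices),
                          batch_size=100,  # assumed; the paper varies batch size
                          shuffle=True)

# A small linear classifier; the paper's exact architecture is not reproduced here.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# SGD with momentum 0.9 and a constant learning rate of 1.0 (no decay),
# matching the values quoted from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=1.0, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # epoch count is an assumption
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

Note that the learning rate of 1.0 is copied verbatim from the paper's setup; on a different architecture than the one the authors used, a value this large may require adjustment to train stably.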