A Bayesian Perspective on Generalization and Stochastic Gradient Descent
Authors: Samuel L. Smith and Quoc V. Le
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We verify these predictions empirically." |
| Researcher Affiliation | Industry | Samuel L. Smith & Quoc V. Le, Google Brain ({slsmith, qvl}@google.com) |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide access to source code for its methodology, nor does it state that code will be released. |
| Open Datasets | Yes | "We form a small balanced training set comprising 800 images from MNIST." A sketch of constructing such a subset appears after this table. |
| Dataset Splits | No | The paper mentions 'validation set accuracy' in Appendix E but does not specify how the validation set was created, or how large it was, for the experiments in the main body (Sections 3 and 4). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | "We use SGD with a momentum parameter of 0.9. Unless otherwise stated, we use a constant learning rate of 1.0 which does not depend on the batch size or decay during training. Furthermore, we train on just 1000 images, selected at random from the MNIST training set." A training sketch follows the table. |
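
The "Open Datasets" row quotes a balanced 800-image MNIST training set. The paper publishes no code, so the snippet below is only a minimal sketch of how such a subset could be formed, assuming torchvision's MNIST loader; the `balanced_subset` helper is hypothetical and not from the paper.

```python
# Minimal sketch (not the authors' code): build a class-balanced MNIST subset.
import torch
from torchvision import datasets, transforms

def balanced_subset(dataset, per_class: int):
    """Select `per_class` examples of each digit, giving a balanced subset."""
    counts = {c: 0 for c in range(10)}
    indices = []
    for idx, (_, label) in enumerate(dataset):
        if counts[label] < per_class:
            counts[label] += 1
            indices.append(idx)
        if len(indices) == 10 * per_class:
            break
    return torch.utils.data.Subset(dataset, indices)

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())
train_800 = balanced_subset(mnist, per_class=80)  # 800 images, 80 per digit
```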
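
Similarly, the "Experiment Setup" row pins down the optimizer but not the framework or model. The sketch below illustrates the quoted configuration (SGD with momentum 0.9, a constant learning rate of 1.0, and 1000 random MNIST training images) in PyTorch; the model architecture, batch size, and epoch count are placeholder assumptions, not the authors' choices.

```python
# Minimal sketch (not the authors' code) of the quoted training configuration.
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

# 1000 images selected at random from the MNIST training set, per the quote.
indices = torch.randperm(len(mnist))[:1000].tolist()
# batch_size=128 is an assumption; the paper varies batch size across runs.
loader = DataLoader(Subset(mnist, indices), batch_size=128, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))  # placeholder model
opt = torch.optim.SGD(model.parameters(), lr=1.0, momentum=0.9)  # per the quote
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # epoch count is illustrative only
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```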