Learning Curves for SGD on Structured Features
Authors: Blake Bordelon, Cengiz Pehlevan
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the accuracy of our theory on random feature models and wide neural networks trained with SGD on real datasets such as MNIST and CIFAR-10. |
| Researcher Affiliation | Academia | Blake Bordelon & Cengiz Pehlevan, John A. Paulson School of Engineering and Applied Sciences, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA. {blake_bordelon,cpehlevan}@g.harvard.edu |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | REPRODUCIBILITY STATEMENT: The code to reproduce the experimental components of this paper can be found at https://github.com/Pehlevan-Group/sgd_structured_features, which contains Jupyter notebook files which we ran in Google Colab. |
| Open Datasets | Yes | We demonstrate the accuracy of our theory on random feature models and wide neural networks trained with SGD on real datasets such as MNIST and CIFAR-10. |
| Dataset Splits | No | The paper mentions using 'training points' and 'test set' but does not specify validation splits or other detailed splitting methodology. |
| Hardware Specification | No | The paper mentions that experiments were run in 'Google Colab' but does not provide specific hardware details (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions using 'Neural Tangents API' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We explore in detail the effect of minibatch size, m, on learning dynamics. By varying m, we can interpolate our theory between single-sample SGD (m = 1) and gradient descent on the population loss (m → ∞). |
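
As a companion to the Experiment Setup row above, the snippet below is a minimal, self-contained sketch of the minibatch-size interpolation on a toy random feature model. It is not the authors' released notebook code: all dimensions, the learning rate, and the synthetic data are illustrative assumptions. Setting `m = 1` recovers single-sample SGD, while `m` near the dataset size approximates full-batch gradient descent, a finite-sample stand-in for descent on the population loss.

```python
import numpy as np

# Illustrative sketch (not the authors' code): minibatch SGD on a random
# feature model. Varying the batch size m interpolates between single-sample
# SGD (m = 1) and full-batch gradient descent (m near P), which serves as a
# finite-sample proxy for gradient descent on the population loss.

rng = np.random.default_rng(0)
d, N, P = 30, 256, 1024            # input dim, num. random features, num. samples (assumed)

# Synthetic linear regression data standing in for a structured-feature task.
X = rng.standard_normal((P, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star

# Random feature map: psi(x) = relu(F x) / sqrt(N) with a fixed random projection F.
F = rng.standard_normal((N, d)) / np.sqrt(d)
Psi = np.maximum(X @ F.T, 0.0) / np.sqrt(N)   # (P, N) feature matrix

def sgd_learning_curve(m, lr=0.1, steps=2000):
    """Run minibatch SGD with batch size m; return the full-data MSE per step."""
    w = np.zeros(N)
    losses = []
    for _ in range(steps):
        # Sample a minibatch of size m with replacement; m = P approximates full-batch GD.
        idx = rng.integers(0, P, size=m)
        residual = Psi[idx] @ w - y[idx]
        grad = Psi[idx].T @ residual / m       # minibatch gradient of (1/2m)||Psi w - y||^2
        w -= lr * grad
        losses.append(np.mean((Psi @ w - y) ** 2))  # track loss on all P points
    return losses

for m in (1, 32, P):
    final = sgd_learning_curve(m)[-1]
    print(f"m = {m:4d}  final training MSE = {final:.4f}")
```

For the paper's actual experiments, which use random features and wide neural networks on MNIST and CIFAR-10 (via the Neural Tangents API), see the repository linked in the Open Source Code row.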