On Warm-Starting Neural Network Training
Authors: Jordan T. Ash, Ryan P. Adams
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a series of experiments across several different architectures, optimizers, and image datasets. |
| Researcher Affiliation | Collaboration | Jordan T. Ash Microsoft Research NYC ash.jordan@microsoft.com, Ryan P. Adams Princeton University rpa@princeton.edu |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology described or a link to a code repository. |
| Open Datasets | Yes | Models are fitted to the CIFAR-10, CIFAR-100, and SVHN image data. |
| Dataset Splits | Yes | Presented results are on a held-out, randomly-chosen third of available data. ... validation sets composed of a random third of available data... |
| Hardware Specification | Yes | Wall-clock time is measured by assigning every model identical resources, consisting of 50GB of RAM and an NVIDIA Tesla P100 GPU. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam [17]) but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | All models are trained using a mini-batch size of 128 and a learning rate of 0.001... We explore all combinations of batch sizes {16, 32, 64, 128}, and learning rates {0.001, 0.01, 0.1}... |
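
The paper releases no code and names no framework (see the Software Dependencies row), so the sketch below is only a rough illustration of the setup quoted in the table: CIFAR-10 with a held-out random third for validation, mini-batches of 128, Adam at learning rate 0.001, and a warm-start comparison (pre-train on part of the data, then continue on all of it) against a freshly initialized model. The framework (PyTorch), the placeholder architecture, the epoch count, and the 50% pre-training split are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of the quoted experimental setup; every library and model
# choice here is an assumption, since the paper provides no code.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

torch.manual_seed(0)

# CIFAR-10 with a held-out, randomly chosen third used for validation.
data = datasets.CIFAR10("./data", train=True, download=True,
                        transform=transforms.ToTensor())
perm = torch.randperm(len(data)).tolist()
val_idx, train_idx = perm[: len(data) // 3], perm[len(data) // 3:]
train_set, val_set = Subset(data, train_idx), Subset(data, val_idx)

def make_model():
    # Placeholder network; the paper evaluates several architectures.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                         nn.ReLU(), nn.Linear(256, 10))

def train(model, subset, epochs=10, lr=1e-3, batch_size=128):
    # Mini-batch size 128 and learning rate 0.001, as quoted in the table.
    loader = DataLoader(subset, batch_size=batch_size, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Warm start: pre-train on half of the training data (assumed split),
# then continue training the same weights on all of it.
half = Subset(train_set, list(range(len(train_set) // 2)))
warm = train(make_model(), half)
warm = train(warm, train_set)

# Cold start: train a freshly initialized model on all of the training data.
cold = train(make_model(), train_set)

@torch.no_grad()
def accuracy(model, subset):
    loader = DataLoader(subset, batch_size=128)
    correct = sum((model(x).argmax(1) == y).sum().item() for x, y in loader)
    return correct / len(subset)

print("warm-start validation accuracy:", accuracy(warm, val_set))
print("cold-start validation accuracy:", accuracy(cold, val_set))
```

Swapping in the paper's other datasets (CIFAR-100, SVHN), its optimizers (SGD, Adam), and the quoted grid of batch sizes {16, 32, 64, 128} and learning rates {0.001, 0.01, 0.1} would extend this skeleton toward the sweep described in the Experiment Setup row.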