Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Bayesian Perspective on Training Speed and Model Selection
Authors: Clare Lyle, Lisa Schut, Robin Ru, Yarin Gal, Mark van der Wilk
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks. We further provide encouraging empirical evidence that the intuition developed in these settings also holds for deep neural networks trained with stochastic gradient descent. |
| Researcher Affiliation | Academia | OATML Group, University of Oxford (correspondence to EMAIL); Imperial College London |
| Pseudocode | Yes | Algorithm 1: Marginal Likelihood Estimation for Linear Models |
| Open Source Code | No | No explicit statement providing access to open-source code for the methodology described in this paper. |
| Open Datasets | Yes | We construct a synthetic dataset inspired by Wilson and Izmailov [46]... Here we evaluate the relative change in the log ML of a Gaussian Process induced by a fully-connected MLP (MLP-NTK-GP) and a convolutional neural network (Conv-NTK-GP) which performs regression on the MNIST dataset... In this section, we evaluate whether this conjecture holds for a simple convolutional neural network trained on the Fashion MNIST dataset... We find the same trend holds for CIFAR-10, which is shown in Appendix B.3. |
| Dataset Splits | No | No explicit percentages, sample counts, or detailed splitting methodology (e.g., '80/10/10 split') for training, validation, and test sets are provided in the main text. Appendix B.2 mentions 'Fashion MNIST dataset' and '20 epochs' but no specific splits. |
| Hardware Specification | Yes | All models are trained using PyTorch (Paszke et al., 2019) on NVIDIA GeForce GTX TITAN X GPUs. |
| Software Dependencies | No | No specific version numbers are provided for software dependencies. The paper mentions 'PyTorch (Paszke et al., 2019)' but without a version number. |
| Experiment Setup | Yes | For all networks, we used the Adam optimizer (Kingma and Ba, 2014) with a batch size of 128 and a learning rate of 1e-4. The models were trained for 20 epochs. |
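
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the `TrainConfig` class, its method names, and `N_TRAIN_ASSUMED` are hypothetical names introduced for illustration. Because the paper does not report its dataset splits, the training-set size of 60,000 is an assumption (the standard Fashion MNIST training set), and so is the choice to keep the final partial batch of each epoch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (hypothetical container)."""

    optimizer: str = "adam"       # Adam (Kingma and Ba, 2014)
    batch_size: int = 128
    learning_rate: float = 1e-4
    epochs: int = 20

    def steps_per_epoch(self, n_train: int) -> int:
        # Assumption: the last, smaller batch is kept rather than dropped.
        return math.ceil(n_train / self.batch_size)

    def total_steps(self, n_train: int) -> int:
        # Total number of optimizer updates over the whole run.
        return self.epochs * self.steps_per_epoch(n_train)


# Assumption: standard Fashion MNIST training set size; the paper's
# actual split is not stated in the main text.
N_TRAIN_ASSUMED = 60_000

config = TrainConfig()
print(config)
print("optimizer steps:", config.total_steps(N_TRAIN_ASSUMED))
```

Keeping the values in one frozen object makes it harder for a rerun to drift from the reported settings. The software version and the dataset split still have to be chosen by whoever reproduces the experiment, since neither is given in the paper.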