Bayesian Neural Network Priors Revisited
Authors: Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we study empirically whether isotropic Gaussian priors are indeed suboptimal for BNNs and whether this can explain the cold posterior effect. We analyze the performance of different BNN priors for different network architectures and compare them to the empirical weight distributions of standard SGD-trained neural networks. (A sketch of such a prior-fit comparison is given below the table.) |
| Researcher Affiliation | Collaboration | Vincent Fortuin, ETH Zürich, Switzerland (fortuin@inf.ethz.ch); Adrià Garriga-Alonso, University of Cambridge, United Kingdom (ag919@cam.ac.uk); Sebastian W. Ober, University of Cambridge, United Kingdom (swo25@cam.ac.uk); Florian Wenzel, Google AI Berlin, Germany (florianwenzel@google.com); Gunnar Rätsch, ETH Zürich, Switzerland (raetsch@inf.ethz.ch); Richard E. Turner, University of Cambridge, United Kingdom (ret26@eng.cam.ac.uk); Mark van der Wilk, Imperial College London, United Kingdom (m.vdwilk@imperial.ac.uk); Laurence Aitchison, University of Bristol, United Kingdom (laurence.aitchison@bristol.ac.uk) |
| Pseudocode | No | The paper does not contain any sections or blocks explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | We make our library available on GitHub (https://github.com/ratschlab/bnn_priors), inviting other researchers to join us in studying the role of priors in BNNs using state-of-the-art inference. MIT licensed. |
| Open Datasets | Yes | We trained an FCNN (Fig. 1, top) and a CNN (Fig. 1, middle) on MNIST (LeCun et al., 1998). ... Next, we did a similar analysis for a ResNet20 trained on CIFAR-10 (Krizhevsky, 2009)... |
| Dataset Splits | No | The paper mentions using train and test sets but does not provide specific percentages or counts for the training/validation/test splits, nor does it explicitly mention using a separate validation set. (The conventional MNIST split is sketched below the table.) |
| Hardware Specification | Yes | We ran the experiments on GPUs of the type NVIDIA GeForce GTX 1080 Ti and NVIDIA GeForce RTX 2080 Ti on our local cluster. |
| Software Dependencies | No | We implemented the inference and models with the PyTorch library (Paszke et al., 2019). To manage our experiments and schedule runs with several settings, we used Sacred (Greff et al., 2017) and Jug (Coelho, 2017), respectively. For the diagnostics, we also use ArviZ (Kumar et al., 2019). (No specific version numbers for these libraries are provided.) |
| Experiment Setup | Yes | For all the MNIST BNN experiments, we perform 60 cycles of SG-MCMC (Zhang et al., 2019) with 45 epochs each. We draw one sample each at the end of the respective last five epochs of each cycle, thus yielding 300 samples after 2,700 epochs, out of which we discarded the first 50 samples as a burn-in. ... We start each cycle with a learning rate of 0.01 and decay to 0 using a cosine schedule. We use a mini-batch size of 128. (The cycle and sample bookkeeping is sketched below the table.) |
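
To make the Research Type row concrete: the paper's empirical question is whether an isotropic Gaussian matches the marginal distribution of SGD-trained weights better than heavier-tailed alternatives. The sketch below is a minimal illustration of that comparison, not the authors' code; the stand-in weight array and the particular candidate distributions are assumptions for illustration only.

```python
# Hypothetical sketch: fit candidate prior families to the empirical marginal
# distribution of network weights and compare their log-likelihoods.
# In practice `weights` would be the flattened parameters of an SGD-trained
# network; a synthetic heavy-tailed array stands in for them here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
weights = rng.standard_t(df=4, size=50_000) * 0.05  # stand-in for trained weights

candidates = {
    "gaussian": stats.norm,
    "laplace": stats.laplace,
    "student_t": stats.t,
}

for name, dist in candidates.items():
    params = dist.fit(weights)                # maximum-likelihood parameter fit
    ll = dist.logpdf(weights, *params).sum()  # total log-likelihood under the fit
    print(f"{name:>10}: log-likelihood = {ll:.1f}")
```

A higher total log-likelihood for a heavy-tailed family than for the Gaussian would echo the paper's observation that isotropic Gaussian priors can be a poor match for empirical weight distributions.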
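On the Dataset Splits row: the paper does not state its splits, so the snippet below only shows the conventional torchvision MNIST setup (60,000 train / 10,000 test, no validation set carved out). This is an assumption about the standard arrangement, not the authors' exact data pipeline.

```python
# Minimal sketch of the standard MNIST split via torchvision (assumed setup).
import torch
from torchvision import datasets, transforms

tfm = transforms.ToTensor()
train_set = datasets.MNIST("data", train=True, download=True, transform=tfm)
test_set = datasets.MNIST("data", train=False, download=True, transform=tfm)

# Mini-batch size 128 matches the paper's stated setting.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128)

print(len(train_set), len(test_set))  # 60000 10000; no separate validation set
```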
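Finally, the Experiment Setup row fully determines the sampling schedule, which the sketch below reproduces: 60 cycles of 45 epochs each, a cosine decay from 0.01 to 0 restarting at every cycle, one sample at each of the last five epochs of a cycle (300 total), and the first 50 samples discarded as burn-in. This is bookkeeping only, assuming a per-cycle cosine restart; it is not the authors' SG-MCMC implementation.

```python
# Sketch of the cyclical SG-MCMC schedule described in the table (assumptions
# noted in the lead-in; the epoch body itself is elided).
import math

N_CYCLES, EPOCHS_PER_CYCLE, SAMPLES_PER_CYCLE, BURN_IN, LR0 = 60, 45, 5, 50, 0.01

def lr_at(epoch_in_cycle: int) -> float:
    """Cosine decay from LR0 down to 0 over one cycle."""
    return 0.5 * LR0 * (1 + math.cos(math.pi * epoch_in_cycle / EPOCHS_PER_CYCLE))

samples = []
for cycle in range(N_CYCLES):
    for epoch in range(EPOCHS_PER_CYCLE):
        lr = lr_at(epoch)
        # ... run one epoch of SG-MCMC with mini-batch size 128 at this lr ...
        if epoch >= EPOCHS_PER_CYCLE - SAMPLES_PER_CYCLE:
            samples.append((cycle, epoch, lr))  # draw one posterior sample

kept = samples[BURN_IN:]  # discard the first 50 samples as burn-in
print(len(samples), len(kept))  # 300 collected, 250 kept after burn-in
```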