What Are Bayesian Neural Network Posteriors Really Like?

Authors: Pavel Izmailov, Sharad Vikram, Matthew D. Hoffman, Andrew Gordon Wilson

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "To investigate foundational questions in Bayesian deep learning, we instead use full-batch Hamiltonian Monte Carlo (HMC) on modern architectures. We show that (1) BNNs can achieve significant performance gains over standard training and deep ensembles..." "In this section we evaluate Bayesian neural networks in various problems using our implementation of HMC." |
| Researcher Affiliation | Collaboration | New York University; Google Research |
| Pseudocode | Yes | "We summarize HMC in Appendix Algorithm 1 and Algorithm 2." |
| Open Source Code | Yes | "We release our JAX (Bradbury et al., 2018) implementation." |
| Open Datasets | Yes | "In our main evaluations we use the CIFAR image classification datasets (Krizhevsky et al., 2014) and the IMDB dataset (Maas et al., 2011) for sentiment analysis." |
| Dataset Splits | No | "For each of these datasets, we construct 20 random 90-to-10 train-test splits and report the mean and standard deviation of performance over the splits." |
| Hardware Specification | Yes | "To scale HMC to modern neural network architectures and datasets like CIFAR-10 and IMDB, we parallelize the computation over 512 TPUv3 devices (Jouppi et al., 2020)." |
| Software Dependencies | No | "We release our JAX (Bradbury et al., 2018) implementation." |
| Experiment Setup | Yes | "We run 3 HMC chains using step size 10^-5 and a prior variance of 1/5, resulting in 70,248 leapfrog steps per sample. In each chain we discard the first 50 samples as burn-in, and then draw 240 samples (720 in total for 3 chains)." |
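
The Experiment Setup row quotes a Gaussian prior with variance 1/5 over the weights, and full-batch HMC targets the corresponding unnormalized log posterior. As a minimal sketch of what that target looks like, the snippet below assumes a hypothetical `log_likelihood_fn(params, batch)` supplied by the user; it is not the authors' released code.

```python
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

PRIOR_VARIANCE = 1.0 / 5.0  # prior variance reported in the paper


def log_prior(params):
    """Isotropic Gaussian prior N(0, PRIOR_VARIANCE * I), up to an additive constant."""
    flat, _ = ravel_pytree(params)  # flatten all weights into one vector
    return -0.5 * jnp.sum(flat ** 2) / PRIOR_VARIANCE


def log_posterior(params, log_likelihood_fn, batch):
    """Unnormalized log posterior: full-batch log likelihood plus log prior."""
    return log_likelihood_fn(params, batch) + log_prior(params)
```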
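The Pseudocode row points to the paper's Appendix Algorithms 1 and 2, which summarize standard HMC with leapfrog integration. The sketch below is one minimal single-device rendering under the settings quoted in the Experiment Setup row (step size 10^-5, 70,248 leapfrog steps per sample, 3 chains of 50 burn-in plus 240 kept samples); `hmc_step`, `run_chain`, and the flat parameter vector `theta` are illustrative names, not the authors' API, and `log_prob_fn` stands in for the log posterior above.

```python
import jax
import jax.numpy as jnp

# Settings reported in the paper
STEP_SIZE = 1e-5
N_LEAPFROG = 70_248   # leapfrog steps per sample
N_CHAINS = 3
N_BURN_IN = 50        # samples discarded per chain
N_SAMPLES = 240       # samples kept per chain (720 total over 3 chains)


def hmc_step(rng, theta, log_prob_fn):
    """One HMC transition: momentum resampling, leapfrog, MH accept/reject."""
    rng_mom, rng_acc = jax.random.split(rng)
    grad_fn = jax.grad(log_prob_fn)

    # Resample momentum from a standard Gaussian.
    p0 = jax.random.normal(rng_mom, theta.shape)

    def leapfrog_body(_, state):
        # Simple (unfused) leapfrog step: half momentum, full position, half momentum.
        q, p = state
        p = p + 0.5 * STEP_SIZE * grad_fn(q)
        q = q + STEP_SIZE * p
        p = p + 0.5 * STEP_SIZE * grad_fn(q)
        return q, p

    q_new, p_new = jax.lax.fori_loop(0, N_LEAPFROG, leapfrog_body, (theta, p0))

    # Metropolis-Hastings correction on the Hamiltonian.
    h_old = -log_prob_fn(theta) + 0.5 * jnp.sum(p0 ** 2)
    h_new = -log_prob_fn(q_new) + 0.5 * jnp.sum(p_new ** 2)
    accept = jax.random.uniform(rng_acc) < jnp.exp(h_old - h_new)
    return jnp.where(accept, q_new, theta)


def run_chain(rng, theta_init, log_prob_fn):
    """Draw N_BURN_IN + N_SAMPLES samples and keep the last N_SAMPLES."""
    samples = []
    theta = theta_init
    for i in range(N_BURN_IN + N_SAMPLES):
        rng, step_rng = jax.random.split(rng)
        theta = hmc_step(step_rng, theta, log_prob_fn)
        if i >= N_BURN_IN:
            samples.append(theta)
    return jnp.stack(samples)
```

Note that each leapfrog step requires a full-batch gradient, which is why the paper parallelizes the computation over 512 TPUv3 devices; the plain Python loop here is only meant to make the sampling schedule concrete.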