Dangers of Bayesian Model Averaging under Covariate Shift

Authors: Pavel Izmailov, Patrick Nicholson, Sanae Lotfi, Andrew G. Wilson

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate BNNs against two deterministic baselines: a MAP solution approximated with stochastic gradient descent (SGD) with momentum [Robbins and Monro, 1951; Polyak, 1964] and a deep ensemble of 10 independently trained MAP solutions [Lakshminarayanan et al., 2017]. For BNNs, we provide results using a Gaussian prior and a more heavy-tailed Laplace prior, following Fortuin et al. [2021]. We run all methods on the MNIST [LeCun et al., 2010] and CIFAR-10 [Krizhevsky et al., 2014] datasets. (See the first sketch below the table.)
Researcher Affiliation | Collaboration | Pavel Izmailov (NYU), Patrick Nicholson (Covera Health), Sanae Lotfi (NYU), Andrew Gordon Wilson (NYU)
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as "Pseudocode" or "Algorithm," nor does it present any structured, code-like blocks for its methods.
Open Source Code | Yes | Our code is available here.
Open Datasets | Yes | We run all methods on the MNIST [LeCun et al., 2010] and CIFAR-10 [Krizhevsky et al., 2014] datasets.
Dataset Splits | No | The paper mentions using the MNIST and CIFAR-10 datasets and their test sets, but it does not explicitly describe a validation split or how one was constructed and used for hyperparameter tuning.
Hardware Specification | Yes | Even on the small architectures that we consider, the experiments take multiple hours on 8 NVIDIA Tesla V-100 GPUs or 8-core TPU-V3 devices [Jouppi et al., 2020].
Software Dependencies | No | The paper mentions algorithms and frameworks such as "stochastic gradient descent (SGD)", "HMC", and "ReLU activations", but it does not specify any software dependencies with version numbers (e.g., "PyTorch 1.9", "TensorFlow 2.x").
Experiment Setup | Yes | On both the CIFAR-10 and MNIST datasets we use a small convolutional network (CNN) inspired by LeNet-5 [LeCun et al., 1998], with 2 convolutional layers followed by 3 fully-connected layers. On MNIST we additionally consider a fully-connected neural network (MLP) with 2 hidden layers of 256 neurons each. For all BNN models, we run a single chain of HMC for 100 iterations, discarding the first 10 iterations as burn-in, following Izmailov et al. [2021]. In each case, we apply the EmpCov prior to the first layer and a Gaussian prior to all other layers. (See the second sketch below the table.)
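
The Research Type row describes both a deep ensemble of 10 independently trained MAP solutions and a Bayesian model average over HMC samples. The following is a minimal Python sketch, not the authors' released code, of the prediction-averaging step common to both; `models` and `predict_probs` are hypothetical stand-ins for the trained networks and a softmax-probability forward pass.

```python
import numpy as np

def ensemble_predict(models, predict_probs, x):
    """Deep ensemble / Bayesian model average: mean of per-model class probabilities.

    models        -- list of trained networks (10 MAP solutions, or HMC posterior samples)
    predict_probs -- callable(model, x) -> array of shape (n_classes,) of softmax probabilities
    x             -- a single input example
    """
    probs = np.stack([predict_probs(m, x) for m in models])  # (n_models, n_classes)
    return probs.mean(axis=0)                                 # averaged predictive distribution

# Usage (hypothetical): the predicted label is
#   np.argmax(ensemble_predict(models, predict_probs, x))
# whether `models` holds the 10 ensemble members or the post-burn-in HMC samples.
```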
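The Experiment Setup row describes the two architectures used in the experiments. Below is a minimal PyTorch sketch of what such networks might look like; the channel counts, kernel sizes, and pooling choices are assumptions in the spirit of LeNet-5 and are not taken from the authors' released code (linked in the Open Source Code row). The EmpCov/Gaussian prior specification and the HMC sampling loop are not shown.

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """LeNet-5-style CNN: 2 convolutional layers followed by 3 fully-connected layers.
    Layer widths are assumptions, not the paper's exact configuration."""
    def __init__(self, in_channels=3, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(120), nn.ReLU(),   # infers the flattened size (MNIST or CIFAR-10)
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

class SmallMLP(nn.Module):
    """Fully-connected network with 2 hidden layers of 256 units each (MNIST only)."""
    def __init__(self, in_dim=28 * 28, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)
```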