Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks

Authors: Agustinus Kristiadi, Runa Eschenhagen, Philipp Hennig

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this work, we experimentally show that the key to good MC-approximated predictive distributions is the quality of the approximate posterior itself. ... We validate the method via extensive experiments and show that refined posteriors are competitive with the much more expensive full-batch Hamiltonian Monte Carlo." (a sketch of this flow-based refinement idea follows the table)
Researcher Affiliation | Academia | Agustinus Kristiadi (University of Tübingen, agustinus.kristiadi@uni-tuebingen.de); Runa Eschenhagen (University of Tübingen, runa.eschenhagen@uni-tuebingen.de); Philipp Hennig (University of Tübingen and MPI for Intelligent Systems, Tübingen, philipp.hennig@uni-tuebingen.de)
Pseudocode | No | The paper describes the proposed method but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code | Yes | Code available at: https://github.com/runame/laplace-refinement.
Open Datasets | Yes | "We validate our method using standard classification datasets: Fashion-MNIST (FMNIST), CIFAR-10, and CIFAR-100."
Dataset Splits | Yes | "These prior precisions are obtained via grid search on the respective HMC baseline, maximizing validation log-likelihood. ... For each in-distribution dataset, we randomly pick 5,000 samples for validation." (a sketch of this split-and-grid-search protocol follows the table)
Hardware Specification | Yes | "Using a standard consumer GPU (Nvidia RTX 2080Ti), each epoch of a length-5 NF's optimization takes around 3.4 seconds."
Software Dependencies | No | The paper mentions software such as Pyro [39] and second-order optimization libraries [23, 24], but does not provide version numbers for any of these dependencies.
Experiment Setup | Yes | "We use prior precisions of 5·10 and 40 for the last-layer F-MNIST and CIFAR experiments, respectively. ... For all methods, we use MC integration with 20 samples to obtain the predictive distribution, except for HMC and CSGHMC, where we use S = 600 and S = 12, respectively. ... More implementation details are in Appendix B." (a sketch of MC integration for the predictive distribution follows the table)
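
For context on the method being assessed: the paper refines a Gaussian (e.g. Laplace) posterior approximation by stacking normalizing-flow layers on top of it and optimizing the ELBO. The sketch below illustrates that general recipe with planar flows in plain PyTorch; it is not the authors' implementation (their released code builds on Pyro), and `log_joint`, which should return log p(D, θ) for a batch of flat weight samples, is a hypothetical placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PlanarFlow(nn.Module):
    """One planar-flow layer f(z) = z + u * tanh(w.z + b)."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(0.01 * torch.randn(dim))
        self.w = nn.Parameter(0.01 * torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        # Constrain u so the map stays invertible (Rezende & Mohamed, 2015).
        wu = self.w @ self.u
        u_hat = self.u + (F.softplus(wu) - 1.0 - wu) * self.w / (self.w @ self.w)
        lin = z @ self.w + self.b                          # (S,)
        z_out = z + torch.tanh(lin).unsqueeze(-1) * u_hat  # (S, dim)
        psi = (1.0 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log((1.0 + psi @ u_hat).abs() + 1e-8)
        return z_out, log_det

def refine(base, flows, log_joint, steps=1000, n_samples=16, lr=1e-3):
    """Fit a length-K flow on top of a Gaussian base by maximizing the ELBO."""
    opt = torch.optim.Adam(nn.ModuleList(flows).parameters(), lr=lr)
    for _ in range(steps):
        z = base.rsample((n_samples,))     # reparameterized base samples
        log_q = base.log_prob(z)           # log-density under the base
        for flow in flows:
            z, log_det = flow(z)
            log_q = log_q - log_det        # change-of-variables correction
        loss = (log_q - log_joint(z)).mean()   # negative ELBO
        opt.zero_grad(); loss.backward(); opt.step()
    return flows
```

With `base = torch.distributions.MultivariateNormal(theta_map, laplace_cov)` and `flows = [PlanarFlow(dim) for _ in range(5)]`, this mirrors the "length-5 NF" setting whose per-epoch cost the paper reports.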
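
The dataset-split evidence combines two steps: a random 5,000-sample validation hold-out per in-distribution dataset, and a grid search over prior precisions that maximizes validation log-likelihood. A minimal sketch of that protocol, assuming torchvision for F-MNIST; the grid bounds are illustrative, and `val_loglik` is a hypothetical stand-in for fitting and evaluating a model at a given prior precision:

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.FashionMNIST(
    "data", train=True, download=True, transform=transforms.ToTensor())

# Randomly hold out 5,000 in-distribution samples for validation.
val_size = 5_000
train_set, val_set = random_split(
    full_train, [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(0))

# Grid-search the prior precision, keeping the value that maximizes
# validation log-likelihood. `val_loglik` is a hypothetical helper.
grid = torch.logspace(-4, 2, steps=13).tolist()
best_precision = max(grid, key=lambda p: val_loglik(p, train_set, val_set))
```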
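
The experiment setup centers on MC integration of the predictive distribution: draw S weight samples from the (refined) posterior and average the per-sample softmax outputs (S = 20 for most methods, S = 600 for HMC, S = 12 for CSGHMC). A minimal sketch, assuming the posterior samples arrive as flat parameter vectors:

```python
import torch
from torch.nn.utils import vector_to_parameters

@torch.no_grad()
def mc_predictive(model, x, weight_samples):
    """p(y | x, D) ~= (1/S) * sum_s softmax(f(x; theta_s)), theta_s ~ q(theta)."""
    probs = []
    for theta in weight_samples:                       # S flat parameter vectors
        vector_to_parameters(theta, model.parameters())
        probs.append(torch.softmax(model(x), dim=-1))
    return torch.stack(probs).mean(dim=0)              # (batch, n_classes)
```

For example, `mc_predictive(net, x_batch, [q.sample() for _ in range(20)])` with some posterior object `q` over the flattened weights reproduces the S = 20 setting.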