Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

Authors: Michael Dusenberry, Ghassen Jerfel, Yeming Wen, Yian Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform a systematic empirical study on the choices of prior, variational posterior, and methods to improve training. For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance across log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants.
Researcher Affiliation | Collaboration | Google Brain, Mountain View, USA; Duke University, Durham, USA; University of Toronto, Toronto, Canada; University of California, San Diego, USA.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/google/edward2.
Open Datasets | Yes | For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance across log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants. MIMIC-III (Johnson et al., 2016).
Dataset Splits | No | The paper mentions a 'Validation Test Method' for MIMIC-III but does not give specific split details (percentages, counts, or methodology) for training, validation, or testing on any of the datasets used.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU models, CPU types, or memory configurations.
Software Dependencies | No | The paper references Edward2 for the code release but does not give version numbers for Edward2 or for any other software dependencies, libraries, or programming languages used.
Experiment Setup | Yes | For each, we tune over the total number of training epochs, and measure NLL, accuracy, and ECE on both the test set and CIFAR-10-C corruptions dataset. As the number of mixture components increases from 1 to 8, the performance across all metrics increases. At K = 16, however, there is a decline in performance. Based on our findings, all experiments in Section 4 use K = 4.
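
To make the rank-1 parameterization and the mixture-of-K posterior referenced above concrete, here is a minimal NumPy sketch. It is not the paper's or edward2's implementation: a dense layer shares one deterministic weight matrix, each of K = 4 mixture components perturbs it with a rank-1 factor built from vectors r and s that carry Gaussian variational posteriors, and predictions are averaged over one sample per component. All variable names, shapes, and initializations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sketch of a rank-1 BNN dense layer with a mixture posterior.
# Effective weights per component: W_k = W_shared * outer(s_k, r_k).
d_in, d_out, K = 8, 4, 4                   # K = 4 mixture components, as in the study
W_shared = rng.normal(size=(d_in, d_out))  # deterministic shared weight matrix

# Variational (Gaussian) parameters of the rank-1 factors, one (r, s) pair per component.
r_mean = rng.normal(size=(K, d_out))
s_mean = rng.normal(size=(K, d_in))
r_log_std = np.full((K, d_out), -3.0)
s_log_std = np.full((K, d_in), -3.0)

def rank1_predict(x):
    """Draw one sample of (r, s) per component and average the K outputs."""
    outputs = []
    for k in range(K):
        r = r_mean[k] + np.exp(r_log_std[k]) * rng.normal(size=d_out)
        s = s_mean[k] + np.exp(s_log_std[k]) * rng.normal(size=d_in)
        # Applying (W_shared * outer(s, r)) to x without forming the
        # perturbed matrix:  ((x * s) @ W_shared) * r
        outputs.append(((x * s) @ W_shared) * r)
    return np.mean(outputs, axis=0)

x = rng.normal(size=d_in)
print(rank1_predict(x))
```

Because only the vectors r and s are sampled, the perturbed weight matrix never has to be materialized per sample, which is what keeps the rank-1 approach memory- and compute-efficient compared with placing a posterior over the full weight matrix.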