Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
Authors: Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a systematic empirical study on the choices of prior, variational posterior, and methods to improve training. For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance across log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants. |
| Researcher Affiliation | Collaboration | Google Brain, Mountain View, USA; Duke University, Durham, USA; University of Toronto, Toronto, Canada; University of California, San Diego, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/google/edward2 (see the illustrative sketch of the rank-1 parameterization after this table). |
| Open Datasets | Yes | For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance across log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants. MIMIC-III (Johnson et al., 2016). |
| Dataset Splits | No | The paper mentions 'Validation Test Method' for MIMIC-III but does not provide specific details on the dataset splits (percentages, counts, or methodology) for training, validation, or testing across any of the datasets used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory configurations. |
| Software Dependencies | No | The paper mentions 'Edward2' as a code reference, but does not provide specific version numbers for Edward2 or any other software dependencies, libraries, or programming languages used. |
| Experiment Setup | Yes | For each, we tune over the total number of training epochs, and measure NLL, accuracy, and ECE on both the test set and CIFAR-10-C corruptions dataset. As the number of mixture components increases from 1 to 8, the performance across all metrics increases. At K = 16, however, there is a decline in performance. Based on our findings, all experiments in Section 4 use K = 4. |
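The rank-1 parameterization referenced throughout the table factorizes each weight perturbation as the outer product of two vectors over a shared deterministic weight matrix. Below is a minimal NumPy sketch of a single stochastic forward pass through a dense layer under that parameterization; it is an illustration under stated assumptions, not the authors' edward2 implementation, and all names, shapes, and parameter values are hypothetical.

```python
import numpy as np

def rank1_dense_forward(x, W, b, r_mean, r_std, s_mean, s_std, rng):
    """One stochastic forward pass of a dense layer with a rank-1 weight perturbation.

    The shared deterministic weight W (shape [d_in, d_out]) is modulated
    elementwise by the outer product of two sampled vectors r (length d_in)
    and s (length d_out): W_eff = W * outer(r, s).
    """
    # Sample the rank-1 factors from factorized Gaussian posteriors (illustrative).
    r = r_mean + r_std * rng.standard_normal(r_mean.shape)
    s = s_mean + s_std * rng.standard_normal(s_mean.shape)
    W_eff = W * np.outer(r, s)  # elementwise rank-1 modulation of the shared weights
    return x @ W_eff + b

# Hypothetical shapes and parameters, purely for illustration.
rng = np.random.default_rng(0)
d_in, d_out, batch = 8, 4, 2
x = rng.standard_normal((batch, d_in))
W = 0.1 * rng.standard_normal((d_in, d_out))
b = np.zeros(d_out)
r_mean, r_std = np.ones(d_in), 0.1 * np.ones(d_in)
s_mean, s_std = np.ones(d_out), 0.1 * np.ones(d_out)

print(rank1_dense_forward(x, W, b, r_mean, r_std, s_mean, s_std, rng).shape)  # (2, 4)
```

Because `x @ (W * outer(r, s))` equals `((x * r) @ W) * s`, the sampled vectors can be applied as cheap elementwise scalings before and after the shared matrix multiply, which is what keeps the approach memory-efficient relative to a full-rank weight posterior; a K-component mixture posterior (K = 4 in the paper's main experiments, per the Experiment Setup row) simply maintains K pairs of (r, s) parameters.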