Bayesian Layers: A Module for Neural Network Uncertainty
Authors: Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As demonstration, we fit a 5-billion parameter Bayesian Transformer on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. |
| Researcher Affiliation | Industry | Dustin Tran (Google Brain), Michael W. Dusenberry (Google Brain), Mark van der Wilk (Prowler.io), Danijar Hafner (Google Brain) |
| Pseudocode | No | The paper includes code snippets in several figures (e.g., Figures 1, 3, 4, 5, 6, 7, 8) but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code is available at https://github.com/google/edward2 as part of the edward2 namespace. |
| Open Datasets | Yes | We implemented a Bayesian Transformer for the WMT14 EN-FR translation task. |
| Dataset Splits | No | The paper references datasets (WMT14 EN-FR translation task, 'cheetah task' for RL) but does not provide specific training/test/validation split percentages or sample counts. |
| Hardware Specification | Yes | As demonstration, we fit a 5-billion parameter Bayesian Transformer on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. |
| Software Dependencies | Yes | Code snippets assume `import edward2 as ed` and `import tensorflow as tf`, with tensorflow==2.0.0 (see the illustrative sketch after this table). |
| Experiment Setup | No | The paper reports training times and memory usage (e.g., 'Training time for the deterministic Transformer takes roughly 13 hours; the Bayesian Transformer takes 16 hours and 2 extra GB per TPU.'), but it does not provide specific hyperparameters such as learning rate, batch size, or optimizer settings needed for reproduction. |
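
The dependency and code-availability rows above quote the paper but do not show how the pieces fit together. The sketch below is a minimal, illustrative example rather than the paper's experiment: it assumes the imports listed in the Software Dependencies row, that the edward2 namespace exposes a Bayesian dense layer of the form `ed.layers.DenseReparameterization` (layer names of this form appear in the paper's figures), and that each Bayesian layer registers its KL divergence in `model.losses` as a Keras regularization loss. The toy data, layer widths, learning rate, and step count are placeholders chosen for illustration.

```python
import edward2 as ed     # Bayesian Layers live in the edward2 namespace
import tensorflow as tf  # the paper's snippets assume tensorflow==2.0.0

# Toy classification data (illustrative only; not a dataset from the paper).
num_examples, num_features, num_classes = 1000, 20, 10
features = tf.random.normal([num_examples, num_features])
labels = tf.random.uniform([num_examples], maxval=num_classes, dtype=tf.int32)

# A small Bayesian feedforward model. Each reparameterization layer places a
# prior over its weights and adds its KL divergence to the model's losses.
model = tf.keras.Sequential([
    ed.layers.DenseReparameterization(64, activation="relu"),
    ed.layers.DenseReparameterization(num_classes),
])
model.build((None, num_features))  # create variables before tracing train_step

optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        nll = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
        kl = tf.add_n(model.losses) / num_examples  # KL penalty scaled per example
        loss = nll + kl  # negative evidence lower bound (ELBO)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for step in range(100):
    loss = train_step(features, labels)
```

Scaling the summed KL penalties by the dataset size makes the objective a per-example negative ELBO; this is a common convention adopted here for illustration, not a hyperparameter choice quoted from the paper.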