Deterministic Variational Inference for Robust Bayesian Neural Networks

Authors: Anqi Wu, Sebastian Nowozin, Edward Meeds, Richard E. Turner, José Miguel Hernández-Lobato, Alexander L. Gaunt

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We implement deterministic variational inference (DVI) as described above to train small ReLU networks on UCI regression datasets (Dheeru & Karra Taniskidou, 2017). The experiments address the claims that our methods for eliminating gradient variance and automatic tuning of the prior improve the performance of the final trained model."
Researcher Affiliation | Collaboration | 1 Princeton Neuroscience Institute, Princeton University; 2 Google AI Berlin; 3 Department of Engineering, University of Cambridge; 4 Microsoft Research, Cambridge
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | "Our implementation in TensorFlow is available at https://github.com/Microsoft/deterministic-variational-inference"
Open Datasets | Yes | "We implement deterministic variational inference (DVI) as described above to train small ReLU networks on UCI regression datasets (Dheeru & Karra Taniskidou, 2017)."
Dataset Splits | No | "Each dataset is split into random training and test sets with 90% and 10% of the data respectively. This splitting process is repeated 20 times and the average test performance of each method at convergence is reported in Table 2."
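The splitting protocol quoted above (random 90%/10% train/test splits, repeated 20 times, with results averaged over repeats) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and seed handling are assumptions.

```python
import numpy as np

def random_splits(n_points, n_repeats=20, train_frac=0.9, seed=0):
    """Generate repeated random train/test index splits.

    Mirrors the protocol described in the paper: each repeat draws an
    independent permutation, and test metrics would be averaged over
    all n_repeats splits. Names and defaults here are illustrative.
    """
    rng = np.random.default_rng(seed)
    n_train = int(train_frac * n_points)
    splits = []
    for _ in range(n_repeats):
        perm = rng.permutation(n_points)
        splits.append((perm[:n_train], perm[n_train:]))  # (train, test)
    return splits

# Example: 20 splits of a 100-point dataset -> 90 train / 10 test each.
splits = random_splits(100)
train_idx, test_idx = splits[0]
```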
Hardware Specification | Yes | "Figure 4 shows the time required to propagate activations through a single layer using the MCVI, DVI and dDVI methods on a Tesla V100 GPU."
Software Dependencies | No | "Our implementation in TensorFlow is available at https://github.com/Microsoft/deterministic-variational-inference" (TensorFlow is mentioned, but no specific version number is provided.)
Experiment Setup | Yes | "The same model is used for each inference method: a single hidden layer of 50 units for each dataset considered, extending this to 100 units in the special case of the larger protein structure dataset, prot."
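The architecture quoted above (one hidden ReLU layer, 50 units, widened to 100 for the prot dataset) can be sketched as a plain point-estimate forward pass. This is a hedged illustration in numpy rather than the authors' TensorFlow implementation, and it shows only the deterministic network shape; DVI itself propagates moments of the weight distribution through these layers, which is not reproduced here.

```python
import numpy as np

def hidden_units(dataset):
    # 100 units for the larger protein structure dataset, 50 otherwise,
    # per the experiment setup quoted above. Dataset names are assumed.
    return 100 if dataset == "prot" else 50

def relu_net_forward(x, w1, b1, w2, b2):
    """Point-estimate forward pass of a single-hidden-layer ReLU network."""
    h = np.maximum(0.0, x @ w1 + b1)  # hidden ReLU layer
    return h @ w2 + b2                # linear output for regression

# Illustrative shapes only: 8 input features, batch of 5.
rng = np.random.default_rng(0)
d_in, d_hidden = 8, hidden_units("boston")
w1 = rng.normal(size=(d_in, d_hidden)) * 0.1
b1 = np.zeros(d_hidden)
w2 = rng.normal(size=(d_hidden, 1)) * 0.1
b2 = np.zeros(1)
y = relu_net_forward(rng.normal(size=(5, d_in)), w1, b1, w2, b2)
```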