Radial and Directional Posteriors for Bayesian Deep Learning

Authors: Changyong Oh, Kamil Adamczewski, Mijung Park

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we provide empirical evidence supporting RDP's strengths. In all experiments, we use the Adam optimizer (Kingma and Ba 2014) with the PyTorch default settings. In all tasks, double grouping is used. Our code is on GitHub. Regression using UCI data: We compare predictive performance on the UCI regression datasets tested in (Gal and Ghahramani 2016; Louizos and Welling 2016), following the experimental settings of (Hernández-Lobato and Adams 2015). We split the datasets 90%/10% between training and test data. To model the noise variance (or precision), we use a Gamma prior on τ, p(τ) = G(a0 = 6, b0 = 6), and posterior q(τ) = G(a1, b1). We optimize a1, b1 along with all the other variational parameters. The architecture used is n_input-50-1, except for the Protein and Year datasets, where 100 hidden neurons are used. Since the second layer's output dimension is one, we use a fully factorized Gaussian there, and RDP with double grouping is applied only to the first layer. This is enough to see an improvement over mean-field-based BNNs such as Variational Inference (VI) (Graves 2011), Probabilistic Back-Propagation (PBP) (Hernández-Lobato and Adams 2015), and Dropout (Gal and Ghahramani 2016). Compared with another dependency-aware posterior, Variational Matrix Gaussian (VMG) (Louizos and Welling 2016), RDP shows better test log-likelihood (LL) on 6 out of 10 datasets.
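The quoted setup pins down enough detail to sketch its noise model in code. Below is a minimal, hypothetical PyTorch sketch (not the authors' RDP code) of the pieces the quote names: a Gamma(6, 6) prior and a learnable Gamma(a1, b1) variational posterior on the noise precision τ, the closed-form expected Gaussian log-likelihood under q(τ), and a deterministic stand-in for the n_input-50-1 architecture. The names `NoisePrecision`, `expected_gaussian_ll`, and `make_net` are illustrative, not from the paper.

```python
import math
import torch
from torch import nn
from torch.distributions import Gamma, kl_divergence

class NoisePrecision(nn.Module):
    """Variational Gamma model for the noise precision tau."""
    def __init__(self, a0=6.0, b0=6.0):
        super().__init__()
        # Log-parameterization keeps a1, b1 positive during optimization.
        self.log_a1 = nn.Parameter(torch.zeros(()))
        self.log_b1 = nn.Parameter(torch.zeros(()))
        self.register_buffer("a0", torch.tensor(a0))
        self.register_buffer("b0", torch.tensor(b0))

    def posterior(self):
        return Gamma(self.log_a1.exp(), self.log_b1.exp())

    def kl(self):
        # KL[q(tau) || p(tau)] between two Gammas, available in closed form.
        return kl_divergence(self.posterior(), Gamma(self.a0, self.b0))

def expected_gaussian_ll(y, y_hat, q):
    # E_q[log N(y | y_hat, 1/tau)], using E[log tau] = digamma(a1) - log(b1)
    # and E[tau] = a1 / b1 for q(tau) = Gamma(a1, b1).
    a1, b1 = q.concentration, q.rate
    e_log_tau = torch.digamma(a1) - torch.log(b1)
    e_tau = a1 / b1
    return 0.5 * (e_log_tau - math.log(2 * math.pi)
                  - e_tau * (y - y_hat) ** 2).sum()

def make_net(n_input, n_hidden=50):
    # Stand-in for the n_input-50-1 architecture; in the paper the layers
    # are Bayesian (RDP on the first, factorized Gaussian on the second).
    return nn.Sequential(nn.Linear(n_input, n_hidden), nn.ReLU(),
                         nn.Linear(n_hidden, 1))
```

Subtracting `NoisePrecision.kl()` (plus the weight-posterior KL terms) from the expected log-likelihood gives the ELBO that Adam would maximize in this setup.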
Researcher Affiliation | Academia | Changyong Oh,1 Kamil Adamczewski,1,2 Mijung Park1,3 — 1Empirical Inference Department, Max Planck Institute for Intelligent Systems; 2Max Planck ETH Center for Learning Systems; 3Department of Computer Science, University of Tübingen. {changyong.oh, mijung.park}@tuebingen.mpg.de, kamil.m.adamczewski@gmail.com
Pseudocode | No | The paper describes algorithms and methods in textual form but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is on GitHub.
Open Datasets | Yes | Regression using UCI data: We compare predictive performance on the UCI regression datasets tested in (Gal and Ghahramani 2016; Louizos and Welling 2016), following the experimental settings of (Hernández-Lobato and Adams 2015). We split the datasets 90%/10% between training and test data.
Dataset Splits | No | We split the datasets 90%/10% between training and test data.
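For concreteness, a 90%/10% split like the one quoted could be done as below. The seed handling and the number of repeated splits are assumptions here, since the quote does not specify them (which is presumably why this variable is marked "No").

```python
import numpy as np

def split_90_10(X, y, seed=0):
    # Shuffle indices, then take the first 90% for training
    # and the remaining 10% for testing.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.9 * len(X))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]
```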
Hardware Specification | No | The paper mentions that variational inference can …
Software Dependencies | No | In all experiments, we use the Adam optimizer (Kingma and Ba 2014) with the PyTorch default settings.
Experiment Setup | Yes | In all experiments, we use the Adam optimizer (Kingma and Ba 2014) with the PyTorch default settings.
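Since the quoted setup leans on PyTorch's default Adam hyperparameters, a minimal sketch makes them explicit; the two-layer model below is an illustrative stand-in, not the paper's Bayesian network.

```python
import torch
from torch import nn

# Illustrative n_input-50-1 stand-in with n_input = 8.
model = nn.Sequential(nn.Linear(8, 50), nn.ReLU(), nn.Linear(50, 1))

# torch.optim.Adam defaults: lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
# weight_decay=0 -- i.e., the "PyTorch default settings" in the quote.
optimizer = torch.optim.Adam(model.parameters())
```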