FUNCTIONAL VARIATIONAL BAYESIAN NEURAL NETWORKS

Authors: Shengyang Sun, Guodong Zhang, Jiaxin Shi, Roger Grosse

ICLR 2019

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | Empirically, we find fBNNs extrapolate well using various structured priors, provide reliable uncertainty estimates, and scale to large datasets.

Researcher Affiliation | Academia | University of Toronto, Vector Institute, Tsinghua University; {ssy, gdzhang, rgrosse}@cs.toronto.edu, shijx15@mails.tsinghua.edu.cn

Pseudocode | Yes | Algorithm 1: Functional Variational Bayesian Neural Networks (fBNNs). (A hedged sketch of this training loop appears after the table.)

Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.

Open Datasets | Yes | We then experimented with standard regression benchmark datasets from the UCI collection (Asuncion & Newman, 2007).

Dataset Splits | Yes | We randomly split the datasets into 80% training, 10% validation, and 10% test. We used the validation set to select the hyperparameters and performed early stopping. (See the first sketch after the table.)

Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, or specific cloud instance types.

Software Dependencies | No | The paper does not list specific software dependencies with version numbers for its implementation (e.g., Python 3.x, PyTorch 1.x).

Experiment Setup | Yes | In all of our experiments, the variational posterior is represented as a stochastic neural network with independent Gaussian distributions over the weights, i.e. q(w) = N(w; µ, diag(σ²)). We always used the ReLU activation function unless otherwise specified. ... For all datasets, we used networks with one hidden layer of 50 hidden units. ... For large-scale datasets, both methods were trained for 80,000 iterations. We used 1 hidden layer with 100 hidden units for all datasets. ... In each iteration, measurement sets consist of 500 training samples and 5 or 50 points from the sampling distribution c, tuned by validation performance. (A sketch of this posterior also follows the table.)
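
The 80/10/10 split protocol quoted above is straightforward to reproduce. Below is a minimal sketch assuming NumPy arrays X and y; the function name and the fixed seed are our own choices, not values from the paper.

import numpy as np

def split_80_10_10(X, y, seed=0):
    """Random 80/10/10 train/validation/test split, as in the UCI protocol."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.8 * len(X))
    n_val = int(0.1 * len(X))
    i_train, i_val, i_test = np.split(idx, [n_train, n_train + n_val])
    return (X[i_train], y[i_train]), (X[i_val], y[i_val]), (X[i_test], y[i_test])

The validation portion is then used for hyperparameter selection and early stopping, per the quoted text.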
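The quoted setup describes a fully factorized Gaussian variational posterior over the weights, sampled with the reparameterization trick. The PyTorch sketch below is our illustration of that parameterization for the one-hidden-layer, 50-unit ReLU network used on UCI; the initialization constants and the softplus link for σ are assumptions, not values from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldLinear(nn.Module):
    """Linear layer with q(w) = N(mu, diag(sigma^2)); sigma = softplus(rho)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(d_in, d_out))
        self.w_rho = nn.Parameter(torch.full((d_in, d_out), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(d_out))
        self.b_rho = nn.Parameter(torch.full((d_out,), -5.0))

    def forward(self, x):
        # Reparameterized sample of weights and biases on every call.
        w = self.w_mu + F.softplus(self.w_rho) * torch.randn_like(self.w_mu)
        b = self.b_mu + F.softplus(self.b_rho) * torch.randn_like(self.b_mu)
        return x @ w + b

class BNN(nn.Module):
    """One hidden layer of 50 ReLU units, matching the quoted UCI setup."""
    def __init__(self, d_in, d_hidden=50):
        super().__init__()
        self.l1 = MeanFieldLinear(d_in, d_hidden)
        self.l2 = MeanFieldLinear(d_hidden, 1)

    def forward(self, x):
        return self.l2(torch.relu(self.l1(x)))

Each forward pass draws a fresh weight sample, so repeated calls at the same inputs yield function-space samples from q.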
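Putting the pieces together, Algorithm 1 alternates between drawing a measurement set and taking a gradient step on the functional ELBO, whose KL term is estimated from function-space samples. The sketch below reuses the BNN class above and is a hypothetical reconstruction under assumptions we make explicit: the sampling distribution c is taken to be a uniform box over the training inputs, the prior is a zero-mean RBF-kernel GP, the score of q(f) comes from a simplified spectral Stein gradient estimator (the paper uses SSGE, but this implementation is ours), and all hyperparameter values are illustrative.

import torch

def rbf(x1, x2, lengthscale=1.0, amp=1.0):
    """RBF kernel; a stand-in for whatever structured GP prior is chosen."""
    return amp**2 * torch.exp(-torch.cdist(x1, x2).pow(2) / (2 * lengthscale**2))

def ssge_grad(f, n_eig=6, jitter=1e-6):
    """Simplified spectral Stein gradient estimate of grad_f log q(f).
    f: (M, n) function samples; returns an (M, n) score estimate."""
    M, n = f.shape
    n_eig = min(n_eig, M)
    diff = f.unsqueeze(1) - f.unsqueeze(0)                 # (M, M, n)
    sq = diff.pow(2).sum(-1)                               # squared distances
    sigma2 = sq.flatten().median().clamp(min=1e-12)        # median heuristic
    K = torch.exp(-sq / (2 * sigma2))
    lam, u = torch.linalg.eigh(K + jitter * torch.eye(M))  # ascending eigenpairs
    lam, u = lam[-n_eig:], u[:, -n_eig:]                   # keep the top few
    grad_K = -diff / sigma2 * K.unsqueeze(-1)              # d k(f_a, f_b) / d f_a
    psi = K @ u * M**0.5 / lam                             # Nystrom eigenfunctions
    grad_psi = torch.einsum('abn,bj->ajn', grad_K, u) * M**0.5 / lam[None, :, None]
    beta = -grad_psi.mean(0)                               # Stein identity
    return psi @ beta                                      # (M, J) @ (J, n)

def train_fbnn(bnn, x_train, y_train, n_iters=1000, n_meas=500, n_c=5,
               n_fn_samples=8, noise_var=0.1, lr=1e-3, jitter=1e-5):
    opt = torch.optim.Adam(bnn.parameters(), lr=lr)
    N, d = x_train.shape
    lo, hi = x_train.min(0).values, x_train.max(0).values
    for _ in range(n_iters):
        # Measurement set: training points plus points from c (uniform box here).
        idx = torch.randint(N, (min(n_meas, N),))
        xc = lo + (hi - lo) * torch.rand(n_c, d)
        xm = torch.cat([x_train[idx], xc], 0)
        # Function-space samples from q at the measurement points.
        f = torch.stack([bnn(xm).squeeze(-1) for _ in range(n_fn_samples)])
        # Expected Gaussian log-likelihood on the training part, rescaled to N.
        f_tr = f[:, :len(idx)]
        ll = -0.5 * ((y_train[idx] - f_tr).pow(2) / noise_var).mean() * N
        # Functional KL gradient: (score_q - score_p) dotted with f,
        # with both scores treated as constants (stop-gradient).
        Kp = rbf(xm, xm) + jitter * torch.eye(len(xm))
        Lp = torch.linalg.cholesky(Kp)
        score_p = -torch.cholesky_solve(f.detach().T, Lp).T   # grad log N(0, Kp)
        score_q = ssge_grad(f.detach())
        kl_surrogate = ((score_q - score_p) * f).sum(-1).mean()
        loss = -ll + kl_surrogate
        opt.zero_grad(); loss.backward(); opt.step()
    return bnn

Per the quoted setup, the paper also tunes the number of points drawn from c (5 or 50) on the validation set and supports structured priors beyond plain GPs; those details are omitted from this sketch.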