FUNCTIONAL VARIATIONAL BAYESIAN NEURAL NETWORKS
Authors: Shengyang Sun, Guodong Zhang, Jiaxin Shi, Roger Grosse
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we find fBNNs extrapolate well using various structured priors, provide reliable uncertainty estimates, and scale to large datasets. |
| Researcher Affiliation | Academia | University of Toronto, Vector Institute, Tsinghua University; {ssy, gdzhang, rgrosse}@cs.toronto.edu, shijx15@mails.tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 Functional Variational Bayesian Neural Networks (fBNNs). (A hedged sketch of one training iteration appears after this table.) |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We then experimented with standard regression benchmark datasets from the UCI collection (Asuncion & Newman, 2007). |
| Dataset Splits | Yes | We randomly split the datasets into 80% training, 10% validation, and 10% test. We used the validation set to select the hyperparameters and performed early stopping. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, or specific cloud instance types. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers for its implementation (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | In all of our experiments, the variational posterior is represented as a stochastic neural network with independent Gaussian distributions over the weights, i.e. q(w) = N(w; µ, diag(σ²)). We always used the ReLU activation function unless otherwise specified. ... For all datasets, we used networks with one hidden layer of 50 hidden units. ... For large-scale datasets, both methods were trained for 80,000 iterations. We used 1 hidden layer with 100 hidden units for all datasets. ... In each iteration, measurement sets consist of 500 training samples and 5 or 50 points from the sampling distribution c, tuned by validation performance. (A minimal code sketch of this setup appears after this table.) |
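
To make the quoted setup concrete, below is a minimal sketch, assuming PyTorch, of a one-hidden-layer ReLU network of 50 units whose weights carry the fully factorized Gaussian posterior q(w) = N(w; µ, diag(σ²)), sampled via the reparameterization trick. The softplus parameterization of σ and the initialization scales are our illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MeanFieldLinear(nn.Module):
    """Linear layer with an independent Gaussian posterior over each weight."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational parameters: mean mu and unconstrained rho per weight.
        self.w_mu = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))

    def forward(self, x):
        # sigma = softplus(rho) keeps the standard deviation positive.
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I).
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)


class StochasticMLP(nn.Module):
    """One hidden layer of 50 ReLU units, as in the quoted UCI setup."""

    def __init__(self, in_dim, hidden=50):
        super().__init__()
        self.l1 = MeanFieldLinear(in_dim, hidden)
        self.l2 = MeanFieldLinear(hidden, 1)

    def forward(self, x):
        return self.l2(F.relu(self.l1(x)))
```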
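The quoted Algorithm 1 row can likewise be sketched as one training iteration: build a measurement set of 500 training points plus points drawn from the sampling distribution c, compute a Monte Carlo log-likelihood on the training part, and combine it with a functional KL term. The KL gradient estimator (the paper uses the spectral Stein gradient estimator, SSGE) is stubbed out below as `functional_kl_surrogate`, a hypothetical placeholder; the Gaussian noise variance and the uniform choice of c are also our assumptions.

```python
import math
import torch


def functional_kl_surrogate(f_m):
    # Hypothetical stand-in: the paper estimates the gradient of
    # KL[q(f_M) || p(f_M)] with the spectral Stein gradient estimator (SSGE).
    # Returning a zero that depends on f_m keeps the sketch runnable.
    return (f_m * 0.0).sum()


def train_step(model, optimizer, x_train, y_train, noise_var=0.1):
    """One fBNN-style update; y_train is assumed to have shape (N, 1)."""
    optimizer.zero_grad()
    # Measurement set: up to 500 training points plus 50 points from c
    # (taken here as Uniform on [-1, 1]^d; the paper tunes 5 vs. 50 points
    # on the validation set).
    idx = torch.randint(0, x_train.shape[0], (min(500, x_train.shape[0]),))
    x_rand = torch.empty(50, x_train.shape[1]).uniform_(-1.0, 1.0)
    x_m = torch.cat([x_train[idx], x_rand], dim=0)
    f_m = model(x_m)  # one sampled function from the variational posterior
    # Gaussian log-likelihood on the training part of the measurement set.
    resid = y_train[idx] - f_m[: idx.shape[0]]
    log_lik = -0.5 * (resid.pow(2) / noise_var
                      + math.log(2 * math.pi * noise_var)).sum()
    loss = functional_kl_surrogate(f_m) - log_lik
    loss.backward()
    optimizer.step()
    return loss.item()
```

With the `StochasticMLP` above, `train_step(model, torch.optim.Adam(model.parameters()), x, y)` would run one such iteration.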