Perturbative Black Box Variational Inference
Authors: Robert Bamler, Cheng Zhang, Manfred Opper, Stephan Mandt
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate PBBVI with different models. First, we investigate its behavior in a controlled setup of Gaussian processes on synthetic data (Section 4.1). We then evaluate PBBVI on a classification task using Gaussian process classifiers, where we use data from the UCI machine learning repository (Section 4.2). Finally, we use an experiment with the variational autoencoder (VAE) to explore our approach on a deep generative model (Section 4.3). This experiment is carried out on MNIST data. |
| Researcher Affiliation | Collaboration | Robert Bamler (Disney Research, Pittsburgh, USA); Cheng Zhang (Disney Research, Pittsburgh, USA); Manfred Opper (TU Berlin, Berlin, Germany); Stephan Mandt (Disney Research, Pittsburgh, USA); firstname.lastname@{disneyresearch.com, tu-berlin.de} |
| Pseudocode | Yes | Algorithm 1: Perturbative Black Box Variational Inference (PBBVI) [a hedged sketch of the surrogate objective it optimizes follows the table] |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use four data sets from the UCI machine learning repository, suitable for binary classification: Crab (200 datapoints), Pima (768 datapoints), Heart (270 datapoints), and Sonar (208 datapoints). ... We train the VAE on the MNIST data set of handwritten digits (LeCun et al., 1998). |
| Dataset Splits | Yes | We split the data set into equally sized training, validation, and test sets. We then tune the learning rate and the number of Monte Carlo samples per gradient step to obtain optimal performance on the validation set after minimizing the alpha-divergence with a fixed budget of random samples. [a hedged sketch of such a split follows the table] |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions building on a publicly available implementation but does not specify any software names with version numbers (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | We set the hyperparameters s = 1 and l = D/2 throughout all experiments, where D is the dimensionality of the input x. ... The model has 100 latent units in the first stochastic layer and 50 latent units in the second stochastic layer. ... We use the same training schedules as in the publicly available implementation, keeping the total number of training iterations independent of the size of the training set. [a hedged kernel sketch using s and l follows the table] |
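
The Pseudocode row points to Algorithm 1 (PBBVI) but the excerpt does not reproduce it. The sketch below gives a Monte Carlo estimate of the order-K perturbative lower bound that PBBVI maximizes, L_K(lambda, V0) = exp(V0) * E_q[ sum_{k=0..K} (log p(x,z) - log q(z) - V0)^k / k! ] with odd K; the expression is a reconstruction from the paper's general form, and the fully factorized Gaussian variational family, the reparameterized sampling, and all function and variable names are assumptions for illustration, not the authors' code (none is released, per the Open Source Code row).

```python
import math
import numpy as np

def pbbvi_surrogate(log_joint, mu, log_sigma, V0, K=3, n_samples=64, seed=0):
    """Monte Carlo estimate of the perturbative lower bound of odd order K:
    L_K = exp(V0) * E_q[ sum_{k=0..K} (log p(x,z) - log q(z) - V0)^k / k! ],
    with a fully factorized Gaussian q(z); `log_joint(z)` returns log p(x, z)."""
    rng = np.random.default_rng(seed)
    sigma = np.exp(log_sigma)
    estimate = 0.0
    for _ in range(n_samples):
        eps = rng.standard_normal(mu.shape)
        z = mu + sigma * eps                       # z ~ q via reparameterization
        log_q = -0.5 * np.sum(eps**2 + np.log(2 * np.pi)) - np.sum(log_sigma)
        V = log_joint(z) - log_q - V0              # shifted interaction term
        estimate += sum(V**k / math.factorial(k) for k in range(K + 1))
    return np.exp(V0) * estimate / n_samples
```

Algorithm 1 presumably ascends this objective jointly in the variational parameters and the reference point V0; doing that in practice would replace the plain NumPy loop with an automatic differentiation framework.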
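
The Dataset Splits row quotes an equal three-way partition into training, validation, and test sets. The sketch below shows one way to produce such a split; the shuffling, the seed, and the handling of sizes not divisible by three are assumptions that the excerpt leaves open.

```python
import numpy as np

def equal_three_way_split(n_points, seed=0):
    """Partition indices 0..n_points-1 into equally sized train/validation/test
    sets; any remainder after dividing by three is dropped (an assumption)."""
    idx = np.random.default_rng(seed).permutation(n_points)
    k = n_points // 3
    return idx[:k], idx[k:2 * k], idx[2 * k:3 * k]

# Example with the Pima data set quoted above (768 datapoints -> 256 per split).
train_idx, val_idx, test_idx = equal_three_way_split(768)
```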
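
The Experiment Setup row fixes the Gaussian process kernel hyperparameters at s = 1 and l = D/2 but does not state the kernel's functional form. The squared exponential form below is therefore an assumption, used only to show where the two quoted values enter.

```python
import numpy as np

def kernel(X1, X2, s=1.0, ell=None):
    """Squared exponential kernel with scale s and length scale ell (assumed form).
    Following the quoted setup, s = 1 and ell = D / 2, with D the input dimension."""
    D = X1.shape[1]
    if ell is None:
        ell = D / 2.0
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return s * np.exp(-sq_dists / (2.0 * ell**2))
```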