Markov Chain Monte Carlo and Variational Inference: Bridging the Gap

Authors: Tim Salimans, Diederik Kingma, Max Welling

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We describe the theoretical foundations that make this possible and show some promising first results. As a first example we look at sampling from the bivariate Gaussian distribution... To demonstrate our Hamiltonian variational approximation algorithm we use an example from (Albert, 2009)... Next, we demonstrate the effectiveness of our Hamiltonian variational inference approach for learning deep generative neural network models. These models are fitted to a binarized version of the MNIST dataset... See table 1 for our numerical results and a comparison to reported results with other methods.
Researcher Affiliation | Collaboration | Tim Salimans (TIM@ALGORITMICA.NL), Algoritmica; Diederik P. Kingma and Max Welling ([D.P.KINGMA,M.WELLING]@UVA.NL), University of Amsterdam
Pseudocode | Yes | Algorithm 1: MCMC lower bound estimate; Algorithm 2: Markov Chain Variational Inference (MCVI); Algorithm 3: Hamiltonian variational inference (HVI); Algorithm 4: Sequential MCVI. (A sketch of the Algorithm 1 estimator appears after the table.)
Open Source Code | No | The paper does not provide any specific link or statement about open-sourcing the code for the described methodology.
Open Datasets | Yes | These models are fitted to a binarized version of the MNIST dataset as e.g. used in (Uria et al., 2014). (A binarization sketch appears after the table.)
Dataset Splits | Yes | Before fitting our models to the full training set, the model hyper-parameters and number of training epochs were determined based on performance on a validation set of about 15% of the available training data.
Hardware Specification | No | The paper mentions computational cost and the use of 'automatic differentiation packages' but does not specify any particular hardware, such as CPU models, GPU models, or memory, used for the experiments.
Software Dependencies | No | The paper mentions Theano and Adam as software used: 'automatic differentiation package such as Theano (Bastien et al., 2012)' and 'Stochastic gradient-based optimization was performed using Adam (Kingma & Ba, 2014) with default hyperparameters.' However, no version numbers are provided for these software dependencies.
Experiment Setup | Yes | We choose q(z_0), q(v'_1|z_0), r(v_1|z_1) to all be multivariate Gaussian distributions with diagonal covariance matrix. The mass matrix M is also diagonal. The means of q(v'_1|z_0) and r(v_1|z_1) are defined as linear functions of z and the gradient ∇_z log p(x, z), with adjustable coefficients. The auxiliary inference model r(v|x, z) is chosen to be a fully-connected neural network with one deterministic hidden layer with nh = 300 hidden units with softplus (log(1 + exp(x))) activations and a Gaussian output variable with diagonal covariance. The number of leapfrog steps was varied from 0 to 16. After broader model search with a validation set, we trained a final model with 16 leapfrog steps and nh = 800. Stochastic gradient-based optimization was performed using Adam (Kingma & Ba, 2014) with default hyperparameters. (A sketch of one leapfrog step appears after the table.)
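
To make the Pseudocode entry concrete, below is a minimal Python sketch of the single-sample lower bound estimator that Algorithm 1 describes, assuming user-supplied callables for the log densities and samplers; the function and argument names (log_p, sample_q0, etc.) are illustrative and not taken from any released code.

    def mcmc_lower_bound_estimate(x, sample_q0, log_q0, sample_qt, log_qt,
                                  log_rt, log_p, T):
        """Single stochastic estimate of the auxiliary-variable lower bound
        log p(x) >= E_q[ log p(x, z_T) - log q(z_0|x)
                         + sum_t ( log r_t(z_{t-1}|x, z_t)
                                   - log q_t(z_t|x, z_{t-1}) ) ].
        """
        z = sample_q0(x)                   # z_0 ~ q(z_0 | x)
        L = -log_q0(z, x)                  # subtract log q(z_0 | x)
        for t in range(1, T + 1):
            z_new = sample_qt(z, x, t)     # z_t ~ q_t(z_t | x, z_{t-1})
            L += log_rt(z, z_new, x, t)    # add log r_t(z_{t-1} | x, z_t)
            L -= log_qt(z_new, z, x, t)    # subtract log q_t(z_t | x, z_{t-1})
            z = z_new
        return L + log_p(x, z)             # add log p(x, z_T)

In the paper this estimate is differentiated with respect to the variational parameters (via stochastic gradients) to maximize the bound, which is what Algorithms 2 to 4 build on.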
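For the Open Datasets entry, the exact binarization procedure is deferred to the citation (Uria et al., 2014); the sketch below shows two common schemes for grey-level MNIST intensities in [0, 1], as an assumption rather than the authors' preprocessing code.

    import numpy as np

    def binarize_mnist(images, stochastic=True, seed=0):
        """Binarize MNIST images with pixel intensities in [0, 1].
        stochastic=True resamples each pixel as Bernoulli(p = intensity);
        stochastic=False thresholds at 0.5. Which variant the paper uses
        is an assumption here, not stated in the quoted text.
        """
        if stochastic:
            rng = np.random.default_rng(seed)
            return (rng.random(images.shape) < images).astype(np.float32)
        return (images > 0.5).astype(np.float32)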
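For the Experiment Setup entry, the leapfrog steps refer to the standard integrator for Hamiltonian dynamics; a hedged NumPy-style sketch with a diagonal mass matrix follows. grad_log_p and the parameter names are placeholders, not the authors' Theano implementation.

    def leapfrog_step(z, v, grad_log_p, step_size, mass_diag):
        """One leapfrog update of position z and momentum v (NumPy arrays)
        for the potential energy U(z) = -log p(x, z), with diagonal mass
        matrix M = diag(mass_diag)."""
        v = v + 0.5 * step_size * grad_log_p(z)   # half step for the momentum
        z = z + step_size * v / mass_diag         # full step for the position
        v = v + 0.5 * step_size * grad_log_p(z)   # second half step for the momentum
        return z, v

In the reported setup such a step would be applied up to 16 times per sample, with the z-dependent Gaussians q(v'_1|z_0) and r(v_1|z_1) supplying and scoring the momenta.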