Auto-Encoding Variational Bayes
Authors: Diederik P. Kingma; Max Welling
ICLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We trained generative models of images from the MNIST and Frey Face datasets and compared learning algorithms in terms of the variational lower bound, and the estimated marginal likelihood. |
| Researcher Affiliation | Academia | Diederik P. Kingma, Machine Learning Group, Universiteit van Amsterdam, dpkingma@gmail.com; Max Welling, Machine Learning Group, Universiteit van Amsterdam, welling.max@gmail.com |
| Pseudocode | Yes | Algorithm 1 Minibatch version of the Auto-Encoding VB (AEVB) algorithm. (A minimal code sketch of this algorithm follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We trained generative models of images from the MNIST and Frey Face datasets and compared learning algorithms in terms of the variational lower bound, and the estimated marginal likelihood. [...] Available at http://www.cs.nyu.edu/~roweis/data.html |
| Dataset Splits | No | The paper mentions training and test sets but does not specify a separate validation split or how it was used for hyperparameter tuning. 'Stepsizes were adapted with Adagrad [DHS10]; the Adagrad global stepsize parameters were chosen from {0.01, 0.02, 0.1} based on performance on the training set in the first few iterations.' |
| Hardware Specification | Yes | Computation took around 20-40 minutes per million training samples with an Intel Xeon CPU running at an effective 40 GFLOPS. |
| Software Dependencies | No | The paper mentions optimization methods like SGD and Adagrad but does not specify software libraries with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Parameters are updated using stochastic gradient ascent where gradients are computed by differentiating the lower bound estimator ∇_{θ,φ} L̃(θ, φ; X^M) (see algorithm 1), plus a small weight decay term corresponding to a prior p(θ) = N(0, I). [...] Stepsizes were adapted with Adagrad [DHS10]; the Adagrad global stepsize parameters were chosen from {0.01, 0.02, 0.1} based on performance on the training set in the first few iterations. Minibatches of size M = 100 were used, with L = 1 samples per datapoint. (A sketch of this update rule follows the table.) |
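
The Pseudocode row above refers to Algorithm 1, the minibatch version of AEVB. As a rough illustration, the sketch below implements one AEVB minibatch step in JAX with the reparameterization trick, a Gaussian MLP encoder, and a Bernoulli MLP decoder, broadly following the paper's Appendices B and C. The layer sizes, initialisation scale, and plain-SGD update are assumptions made for brevity, not the authors' exact configuration.

```python
# A minimal AEVB (Algorithm 1) sketch in JAX. Gaussian MLP encoder q(z|x),
# Bernoulli MLP decoder p(x|z); sizes and SGD update are illustrative assumptions.
import jax
import jax.numpy as jnp

def init_params(key, x_dim=784, h_dim=500, z_dim=20):
    """Randomly initialise encoder/decoder weights (sizes are assumptions)."""
    ks = jax.random.split(key, 5)
    def dense(k, n_in, n_out):
        return (0.01 * jax.random.normal(k, (n_in, n_out)), jnp.zeros(n_out))
    return {
        "enc_h":  dense(ks[0], x_dim, h_dim),   # x -> hidden
        "enc_mu": dense(ks[1], h_dim, z_dim),   # hidden -> mean of q(z|x)
        "enc_lv": dense(ks[2], h_dim, z_dim),   # hidden -> log variance of q(z|x)
        "dec_h":  dense(ks[3], z_dim, h_dim),   # z -> hidden
        "dec_p":  dense(ks[4], h_dim, x_dim),   # hidden -> Bernoulli logits
    }

def affine(layer, x):
    W, b = layer
    return x @ W + b

def neg_elbo(params, x, eps):
    """Monte Carlo estimate of -L(theta, phi; x) using L = 1 noise sample eps."""
    h = jnp.tanh(affine(params["enc_h"], x))
    mu = affine(params["enc_mu"], h)
    logvar = affine(params["enc_lv"], h)
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + jnp.exp(0.5 * logvar) * eps
    logits = affine(params["dec_p"], jnp.tanh(affine(params["dec_h"], z)))
    # Bernoulli log-likelihood log p(x|z), summed over pixels.
    log_px = jnp.sum(x * jax.nn.log_sigmoid(logits)
                     + (1.0 - x) * jax.nn.log_sigmoid(-logits), axis=-1)
    # Analytical -KL(q(z|x) || N(0, I)), as in the paper's Appendix B.
    neg_kl = 0.5 * jnp.sum(1.0 + logvar - mu**2 - jnp.exp(logvar), axis=-1)
    return -jnp.mean(log_px + neg_kl)

@jax.jit
def aevb_step(params, x_batch, eps, lr=0.02):
    """One minibatch update of Algorithm 1 (plain SGD here for brevity)."""
    loss, grads = jax.value_and_grad(neg_elbo)(params, x_batch, eps)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss
```

In the full Algorithm 1, this step would be repeated over randomly drawn minibatches of M = 100 datapoints, with a fresh eps ~ N(0, I) (L = 1 sample per datapoint) sampled at every iteration, until the parameters converge.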
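The Experiment Setup row describes stochastic gradient ascent with Adagrad-adapted stepsizes and a small weight-decay term corresponding to the prior p(θ) = N(0, I). The sketch below shows one such update in JAX. The weight-decay coefficient and the smoothing constant `eps` are illustrative assumptions; only the global stepsize grid {0.01, 0.02, 0.1} and the minibatch size come from the paper.

```python
# Adagrad ascent step with a weight-decay term, as described in the
# Experiment Setup row. The weight_decay and eps values are assumptions;
# only the global stepsize grid {0.01, 0.02, 0.1} is from the paper.
import jax
import jax.numpy as jnp

def adagrad_ascent_step(params, grads, accum,
                        global_stepsize=0.02,  # chosen from {0.01, 0.02, 0.1}
                        weight_decay=1e-3,     # assumed coefficient for p(theta) = N(0, I)
                        eps=1e-8):             # assumed smoothing constant
    """One Adagrad update on ascent gradients of the lower bound estimator."""
    # Add the gradient of the log-prior, i.e. a weight-decay pull toward zero.
    grads = jax.tree_util.tree_map(lambda g, p: g - weight_decay * p, grads, params)
    # Accumulate squared gradients (the per-parameter Adagrad state).
    accum = jax.tree_util.tree_map(lambda a, g: a + g**2, accum, grads)
    # Ascent step with per-parameter stepsize global_stepsize / sqrt(accum).
    params = jax.tree_util.tree_map(
        lambda p, g, a: p + global_stepsize * g / (jnp.sqrt(a) + eps),
        params, grads, accum)
    return params, accum
```

Here `accum` starts as a pytree of zeros with the same shapes as `params`, and the update is applied once per minibatch of M = 100 datapoints in the reported setup.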