Automatic Variational Inference in Stan

Authors: Alp Kucukelbir, Rajesh Ranganath, Andrew Gelman, David Blei

Venue: NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We compare ADVI to MCMC sampling across hierarchical generalized linear models, nonconjugate matrix factorization, and a mixture model. We train the mixture model on a quarter million images." |
| Researcher Affiliation | Academia | Alp Kucukelbir (Columbia University, alp@cs.columbia.edu); Rajesh Ranganath (Princeton University, rajeshr@cs.princeton.edu); Andrew Gelman (Columbia University, gelman@stat.columbia.edu); David M. Blei (Columbia University, david.blei@columbia.edu) |
| Pseudocode | Yes | "Algorithm 1: Automatic differentiation variational inference (ADVI)" |
| Open Source Code | Yes | "We propose an automatic variational inference algorithm, automatic differentiation variational inference (ADVI); we implement it in Stan (code available), a probabilistic programming system." (See the usage sketch below the table.) |
| Open Datasets | Yes | "Here, we show how easy it is to explore new models using ADVI. In both models, we use the Frey Face dataset, which contains 1956 frames (28 × 20 pixels) of facial expressions extracted from a video sequence. We explore the ImageCLEF dataset, which has 250,000 images [25]." |
| Dataset Splits | No | The paper mentions training sets and held-out test sets (e.g., "We use 10,000 training samples and hold out 1,000 for testing"), but it does not explicitly define or use a separate validation split. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU model, memory) used to run the experiments. |
| Software Dependencies | Yes | "ADVI is available in Stan 2.8. See Appendix C." |
| Experiment Setup | Yes | "We approximate the posterior predictive likelihood using an MC estimate. For MCMC, we plug in posterior samples. For ADVI, we draw samples from the posterior approximation during the optimization. We initialize ADVI with a draw from a standard Gaussian. We study ADVI with two settings of M, the number of MC samples used to estimate gradients. A single sample per iteration is sufficient; it is also the fastest. (We set M = 1 from here on.)" (See the gradient sketch below the table.) |
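
To make the "Open Source Code" and "Software Dependencies" rows concrete, here is a minimal sketch of invoking Stan's ADVI through CmdStanPy. The toy model, data, and file name are assumptions for illustration only; the paper shipped ADVI with Stan 2.8, before CmdStanPy existed, so this just demonstrates the present-day call pattern.

```python
# Minimal sketch (assumption: CmdStanPy and CmdStan are installed) of
# running Stan's meanfield ADVI, the algorithm described in the paper.
from cmdstanpy import CmdStanModel

# Toy Stan program standing in for the paper's models (hypothetical).
stan_code = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;  // ADVI transforms this to the unconstrained space
}
model {
  mu ~ normal(0, 10);
  sigma ~ lognormal(0, 1);
  y ~ normal(mu, sigma);
}
"""
with open("toy.stan", "w") as f:
    f.write(stan_code)

model = CmdStanModel(stan_file="toy.stan")
fit = model.variational(
    data={"N": 5, "y": [0.1, -0.4, 1.2, 0.3, -0.7]},
    algorithm="meanfield",  # fully factorized Gaussian approximation
    grad_samples=1,         # M = 1 MC sample per gradient, as quoted above
    seed=1,
)
print(fit.variational_params_dict)
```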
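And to unpack the "Pseudocode" and "Experiment Setup" rows, the sketch below implements the core step of a mean-field ADVI iteration with M = 1: a single reparameterized draw gives a stochastic gradient of the ELBO for a Gaussian approximation in the unconstrained space. The stand-in target density and the fixed step size are assumptions; Algorithm 1 in the paper uses Stan's automatic differentiation and an adaptive step-size sequence.

```python
# Minimal sketch of the mean-field ADVI gradient step with M = 1
# (a single reparameterized Monte Carlo sample per iteration), as in
# the experiment setup quoted above. The target is a stand-in: an
# isotropic Gaussian posterior whose gradient we know in closed form.
import numpy as np

rng = np.random.default_rng(0)
D = 2

def grad_log_p(zeta):
    # Stand-in target (assumption): log p(zeta) = -0.5 * ||zeta - 3||^2.
    return -(zeta - 3.0)

# Variational parameters of q(zeta) = N(mu, diag(exp(omega))^2),
# initialized from a standard Gaussian draw, echoing the paper's setup.
mu = rng.standard_normal(D)
omega = np.zeros(D)  # log standard deviations

step = 0.05  # fixed step size; the paper uses an adaptive sequence
for _ in range(2000):
    eta = rng.standard_normal(D)     # single MC sample (M = 1)
    zeta = mu + np.exp(omega) * eta  # reparameterization trick
    g = grad_log_p(zeta)
    mu += step * g                   # stochastic grad of ELBO w.r.t. mu
    omega += step * (g * eta * np.exp(omega) + 1.0)  # + entropy gradient

print("mu  ->", mu)             # approaches the target mean (3, 3)
print("sd  ->", np.exp(omega))  # approaches the target sd (1, 1)
```

Note that the Gaussian entropy enters the ELBO analytically (its gradient with respect to each log standard deviation is exactly 1), so the `+ 1.0` term is exact rather than sampled; only the model term is noisy, which is consistent with the quoted finding that a single sample per iteration is sufficient.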