Hierarchical Variational Models

Authors: Rajesh Ranganath, Dustin Tran, David Blei

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We consider a simulated study on a two-dimensional discrete posterior; we also evaluate our proposed variational models on deep exponential families (Ranganath et al., 2015), a class of deep generative models which achieve state-of-the-art results on text analysis. In total, we train 2 variational models for the simulated study and 12 models over two datasets. Table 2. New York Times. Held-out perplexity (lower is better). Table 3. Science. Held-out perplexity (lower is better).
Researcher Affiliation | Academia | Rajesh Ranganath, RAJESHR@CS.PRINCETON.EDU, Princeton University, 35 Olden St., Princeton, NJ 08540; Dustin Tran, DUSTIN@CS.COLUMBIA.EDU; David M. Blei, DAVID.BLEI@COLUMBIA.EDU, Columbia University, 500 W 120th St., New York, NY 10027
Pseudocode | Yes | Algorithm 1: Black box inference with an HVM
Open Source Code | Yes | An implementation of HVMs is available in Edward (Tran et al., 2016a), a Python library for probabilistic modeling.
Open Datasets | Yes | We consider two text corpora of news and scientific articles: The New York Times (NYT) and Science. We compare to the mean-field approximation from Ranganath et al. (2015) which achieves state of the art results on text.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions 'Edward', 'Stan', and 'Theano' as software tools but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We parameterize each variational prior q(λ_{z_i}) with a normalizing flow of length 2, and use the inverse flow of length 10 for r(λ_{z_i}). We use planar transformations (Rezende and Mohamed, 2015).
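
The Experiment Setup entry refers to planar transformations stacked into a normalizing flow. As a rough illustration only, the following NumPy sketch shows what a flow of length 2 over a standard normal base could look like. It is not the Edward implementation or the authors' code: the function names (`planar_step`, `planar_flow`, `flow_log_density`), the standard normal base distribution, and the random parameter values are assumptions made for this example. The invertibility reparameterization of `u` follows the construction described in Rezende and Mohamed (2015).

```python
import numpy as np

def planar_step(z, u, w, b):
    """Apply one planar map f(z) = z + u_hat * tanh(w.z + b) to a batch z of shape (n, d).

    Returns the transformed batch and log|det df/dz| for each sample.
    """
    # Reparameterize u so that w.u_hat > -1, which keeps the map invertible.
    wu = w @ u
    softplus = np.logaddexp(0.0, wu)          # log(1 + exp(wu))
    u_hat = u + ((-1.0 + softplus) - wu) * w / (w @ w)

    a = z @ w + b                             # pre-activation, shape (n,)
    h = np.tanh(a)
    f = z + np.outer(h, u_hat)                # shape (n, d)

    # Jacobian is I + u_hat psi^T, so det = 1 + psi . u_hat.
    psi = (1.0 - h ** 2)[:, None] * w         # shape (n, d)
    log_det = np.log(np.abs(1.0 + psi @ u_hat))
    return f, log_det

def planar_flow(z0, params):
    """Compose planar maps; params is a list of (u, w, b) tuples, one per step."""
    z = z0
    total_log_det = np.zeros(z0.shape[0])
    for u, w, b in params:
        z, log_det = planar_step(z, u, w, b)
        total_log_det += log_det
    return z, total_log_det

def flow_log_density(z0, params):
    """Return z_K and log q(z_K) for z0 drawn from a standard normal base (an assumption)."""
    d = z0.shape[1]
    log_q0 = -0.5 * np.sum(z0 ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)
    zK, total_log_det = planar_flow(z0, params)
    # Change of variables: log q_K(z_K) = log q_0(z_0) - sum_k log|det J_k|.
    return zK, log_q0 - total_log_det

# Hypothetical usage: a flow of length 2 in two dimensions.
rng = np.random.default_rng(0)
params = [(rng.normal(size=2), rng.normal(size=2), rng.normal()) for _ in range(2)]
z0 = rng.normal(size=(5, 2))
zK, log_q = flow_log_density(z0, params)
```

In the paper's setting the flow would act on the mean-field parameters λ rather than on the latent variables directly, and the auxiliary distribution r would use a separate, longer flow; the sketch only covers the forward transformation and its density.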