Hierarchical Variational Models
Authors: Rajesh Ranganath, Dustin Tran, David Blei
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider a simulated study on a two-dimensional discrete posterior; we also evaluate our proposed variational models on deep exponential families (Ranganath et al., 2015), a class of deep generative models which achieve state-of-the-art results on text analysis. In total, we train 2 variational models for the simulated study and 12 models over two datasets. Table 2. New York Times. Held-out perplexity (lower is better). Table 3. Science. Held-out perplexity (lower is better). |
| Researcher Affiliation | Academia | Rajesh Ranganath (RAJESHR@CS.PRINCETON.EDU), Princeton University, 35 Olden St., Princeton, NJ 08540; Dustin Tran (DUSTIN@CS.COLUMBIA.EDU) and David M. Blei (DAVID.BLEI@COLUMBIA.EDU), Columbia University, 500 W 120th St., New York, NY 10027 |
| Pseudocode | Yes | Algorithm 1: Black box inference with an HVM |
| Open Source Code | Yes | An implementation of HVMs is available in Edward (Tran et al., 2016a), a Python library for probabilistic modeling. |
| Open Datasets | Yes | We consider two text corpora of news and scientific articles: The New York Times (NYT) and Science. We compare to the mean-field approximation from Ranganath et al. (2015), which achieves state-of-the-art results on text. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Edward', 'Stan', and 'Theano' as software tools but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We parameterize each variational prior q(λ_{z_i}) with a normalizing flow of length 2, and use the inverse flow of length 10 for r(λ_{z_i}). We use planar transformations (Rezende and Mohamed, 2015). |
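
The Experiment Setup row refers to a normalizing flow of length 2 built from planar transformations. As a rough illustration of what that means, the sketch below composes planar maps f(z) = z + u·tanh(wᵀz + b) in NumPy and tracks the log-determinant of the Jacobian, using the invertibility reparameterization of u described by Rezende and Mohamed (2015). It is not the paper's or Edward's implementation: the function names, the standard Gaussian base distribution, and the random parameter values are assumptions made for illustration, and the inverse flow used for r is not covered.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def planar_step(z, u, w, b):
    """Apply one planar map to a batch z of shape (n, d).

    Returns the transformed batch and log|det Jacobian| per row, shape (n,).
    """
    wu = w @ u
    # Reparameterize u so that w . u_hat > -1, which keeps the map invertible.
    u_hat = u + (softplus(wu) - 1.0 - wu) * w / (w @ w)
    a = z @ w + b                            # pre-activations, shape (n,)
    f = z + np.outer(np.tanh(a), u_hat)      # z + u_hat * tanh(w . z + b)
    h_prime = 1.0 - np.tanh(a) ** 2          # derivative of tanh
    log_det = np.log(np.abs(1.0 + h_prime * (w @ u_hat)))
    return f, log_det


def planar_flow(z0, params):
    """Compose planar maps; params is a list of (u, w, b) tuples."""
    z = z0
    total_log_det = np.zeros(z0.shape[0])
    for u, w, b in params:
        z, log_det = planar_step(z, u, w, b)
        total_log_det += log_det
    return z, total_log_det


def flow_log_density(z0, params):
    """Log-density of the flow output, assuming a standard Gaussian base."""
    d = z0.shape[1]
    log_q0 = -0.5 * np.sum(z0 ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)
    z_k, total_log_det = planar_flow(z0, params)
    # Change of variables: log q_K(z_K) = log q_0(z_0) - sum of log-dets.
    return z_k, log_q0 - total_log_det


# Hypothetical usage: a flow of length 2 in two dimensions.
rng = np.random.default_rng(0)
params = [(rng.normal(size=2), rng.normal(size=2), rng.normal()) for _ in range(2)]
z0 = rng.normal(size=(5, 2))
z_k, log_q = flow_log_density(z0, params)
```

Each planar map adds only 2d + 1 parameters and its log-determinant costs O(d), which is why short flows such as the length-2 prior above are cheap to evaluate.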