Fixing a Broken ELBO
Authors: Alexander Alemi, Ben Poole, Ian Fischer, Joshua Dillon, Rif A. Saurous, Kevin Murphy
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition to our unifying theoretical framework, we empirically study the performance of a variety of different VAE models with both simple and complex encoders, decoders, and priors on several simple image datasets in terms of the RD curve. We show that VAEs with powerful autoregressive decoders can be trained to not ignore their latent code by targeting certain points on this curve. We also show how it is possible to recover the true generative process (up to reparameterization) of a simple model on a synthetic dataset with no prior knowledge except for the true value of the mutual information I (derived from the true generative model). We believe that information constraints provide an interesting alternative way to regularize the learning of latent variable models. |
| Researcher Affiliation | Collaboration | ¹Google AI, ²Stanford University. Correspondence to: Alexander A. Alemi <alemi@google.com>. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We use the static binary MNIST dataset from Larochelle & Murray (2011). |
| Dataset Splits | No | The paper identifies the dataset only by citation (the 'static binary MNIST dataset' of Larochelle & Murray, 2011) and does not explicitly state train/validation/test split percentages, sample counts, or the methodology used for data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not mention any specific software dependencies or libraries with version numbers. |
| Experiment Setup | Yes | We train them all to minimize the β-VAE objective in Equation 6. Full details can be found in Appendix F. Runs were performed at various values of β ranging from 0.1 to 10.0, both with and without KL annealing (Bowman et al., 2016). |
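For context on the quoted setup: the paper's rate–distortion framing writes the β-VAE objective (its Equation 6) as distortion plus β times rate. A reconstruction of that form is shown below; the notation here follows common VAE convention (encoder q_φ, decoder/prior p_θ) rather than the paper's own e/m/d symbols.

```latex
\min_{\theta,\phi}\; D + \beta R,
\qquad
R = \mathbb{E}_{p^*(x)}\!\left[\mathrm{KL}\!\left(q_\phi(z \mid x)\,\middle\|\,p_\theta(z)\right)\right],
\qquad
D = -\,\mathbb{E}_{p^*(x)}\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
```

Sweeping β traces out points on the rate–distortion (RD) curve that the paper studies: small β tolerates a high rate (more information in the latent code), while large β pushes the rate toward zero, which is how powerful autoregressive decoders come to ignore the latent code at β = 1.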
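Below is a minimal, self-contained Python sketch of how that objective is computed for binarized MNIST-style data. This is not the authors' code: the shapes, hyperparameters, and toy inputs are illustrative assumptions, and only the loss form D + βR and the β range come from the paper.

```python
# Sketch of the beta-VAE objective the quoted setup minimizes:
#   L(beta) = D + beta * R
# where R is the expected KL from encoder to prior (rate) and D is the
# expected negative reconstruction log-likelihood (distortion).
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kl(mu, logvar):
    """Analytic per-example KL(q(z|x) || p(z)) for a diagonal Gaussian
    encoder against a standard normal prior p(z) = N(0, I)."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def bernoulli_nll(x, logits):
    """Per-example negative log-likelihood of binary pixels under a
    Bernoulli decoder parameterized by logits (numerically stable form)."""
    return np.sum(np.maximum(logits, 0) - logits * x
                  + np.log1p(np.exp(-np.abs(logits))), axis=-1)

def beta_vae_loss(x, mu, logvar, recon_logits, beta=1.0):
    """Distortion + beta * rate, averaged over the batch (nats)."""
    rate = gaussian_kl(mu, logvar)               # R
    distortion = bernoulli_nll(x, recon_logits)  # D
    return np.mean(distortion + beta * rate)

# Toy batch standing in for encoder/decoder outputs: 8 binarized
# 784-pixel "images" and a 32-dimensional latent (assumed shapes).
x = (rng.random((8, 784)) > 0.5).astype(np.float64)
mu = rng.normal(size=(8, 32))
logvar = rng.normal(size=(8, 32))
recon_logits = rng.normal(size=(8, 784))

# The quoted setup sweeps beta from 0.1 to 10.0; KL annealing would
# additionally ramp the weight on the rate term from ~0 to its target
# over the early steps of training.
for beta in (0.1, 1.0, 10.0):
    print(f"beta={beta:4.1f}  loss={beta_vae_loss(x, mu, logvar, recon_logits, beta):.2f}")
```

In a real run the encoder statistics (mu, logvar) and decoder logits would come from trained networks rather than random draws; the point of the sketch is that each target point on the RD curve corresponds to one fixed β in this loss.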