Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Isolating Sources of Disentanglement in Variational Autoencoders

Authors: Ricky T. Q. Chen, Xuechen Li, Roger B. Grosse, David K. Duvenaud

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We perform extensive quantitative and qualitative experiments, in both restricted and non-restricted settings, and show a strong relation between total correlation and disentanglement, when the model is trained using our framework.
Researcher Affiliation	Academia	Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud University of Toronto, Vector Institute
Pseudocode	No	The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Code is available at .
Open Datasets	Yes	We perform quantitative evaluations with two datasets, a dataset of 2D shapes [39] and a dataset of synthetic 3D faces [40]... [39] refers to: dsprites: Disentanglement testing sprites dataset. https://github.com/deepmind/dsprites-dataset/, 2017.
Dataset Splits	No	The paper mentions using datasets for experiments but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or sample counts).
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow 1.x, PyTorch 1.x) that were used for the experiments.
Experiment Setup	Yes	We used β = 4 for β-VAE and β = 6 for β-TCVAE, based on modes in Figure 2. For InfoGAN, we used 5 continuous latent codes and 5 noise variables. Other settings are chosen following those suggested by [6], but we also added instance noise [41] to stabilize training. ... we tuned β ∈ [1, 80] and used double the number of iterations for FactorVAE. Note that while β-VAE, FactorVAE and β-TCVAE use a fully connected architecture for the dSprites dataset, InfoGAN uses a convolutional architecture for increased stability.
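
The settings quoted in the Experiment Setup row can be collected into a small configuration record, which may help when attempting a reproduction. The following is a minimal sketch, not the authors' code: the class name MethodSetup, its field names, and the SETUPS dictionary are hypothetical, and only values stated in the row are filled in. The quoted sentence about the β range is truncated, so attributing the [1, 80] range to FactorVAE is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MethodSetup:
    """Settings quoted in the Experiment Setup row (field names are hypothetical)."""
    name: str
    beta: Optional[float] = None                  # fixed beta, when one is stated
    beta_range: Optional[Tuple[int, int]] = None  # tuning range, when one is stated
    dsprites_architecture: Optional[str] = None   # architecture used on dSprites
    iteration_multiplier: int = 1                 # relative number of training iterations
    continuous_codes: Optional[int] = None        # InfoGAN only
    noise_variables: Optional[int] = None         # InfoGAN only


SETUPS = {
    "beta-VAE": MethodSetup(
        "beta-VAE", beta=4, dsprites_architecture="fully connected"
    ),
    "beta-TCVAE": MethodSetup(
        "beta-TCVAE", beta=6, dsprites_architecture="fully connected"
    ),
    # Assumption: the truncated quote ties the [1, 80] range to FactorVAE.
    "FactorVAE": MethodSetup(
        "FactorVAE",
        beta_range=(1, 80),
        dsprites_architecture="fully connected",
        iteration_multiplier=2,
    ),
    "InfoGAN": MethodSetup(
        "InfoGAN",
        dsprites_architecture="convolutional",
        continuous_codes=5,
        noise_variables=5,
    ),
}

if __name__ == "__main__":
    for setup in SETUPS.values():
        print(setup)
```

Values the row does not state (learning rates, batch sizes, the settings taken from [6], the instance-noise schedule) are deliberately left out rather than guessed.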