Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning

Authors: Kyle Hsu, Jubayer Ibn Hamid, Kaylee Burns, Chelsea Finn, Jiajun Wu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We benchmark on four established image datasets with ground-truth source labels that facilitate quantitative evaluation."
Researcher Affiliation | Academia | "Stanford University. Correspondence to: Kyle Hsu <kylehsu@cs.stanford.edu>."
Pseudocode | Yes | "Algorithm 1: Pseudocode for the Tripod objective." (A hedged sketch of one ingredient of the objective appears below the table.)
Open Source Code | Yes | "Code is available at https://github.com/kylehkhsu/tripod."
Open Datasets | Yes | "We benchmark on four established image datasets with ground-truth source labels that facilitate quantitative evaluation: Shapes3D (Burgess & Kim, 2018), MPI3D (Gondal et al., 2019), Falcor3D (Nie, 2019), and Isaac3D (Nie, 2019)."
Dataset Splits | Yes | "We follow prior work in considering a statistical learning problem: we use the entire dataset for unsupervised training and evaluate on a subset of 10,000 samples (Locatello et al., 2019)." (See the split sketch below the table.)
Hardware Specification | No | The paper's "Profiling Study" section measures training iteration runtime, but the paper does not report hardware specifics such as the GPU/CPU models or memory of the machines used in the experiments.
Software Dependencies | No | The paper acknowledges the developers of NumPy (Harris et al., 2020), JAX (Bradbury et al., 2018), Equinox (Kidger & Garcia, 2021), and scikit-learn (Pedregosa et al., 2011), but does not give version numbers for these dependencies.
Experiment Setup | Yes | Table 4: fixed hyperparameters for all autoencoder variants (e.g., number of latents n_z, AdamW learning rate, batch size). Table 5: key regularization hyperparameters tuned per autoencoder (e.g., β, weight decay, the vanilla Hessian penalty weight λ, the latent multiinformation weight λ).
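
One of Tripod's three inductive biases builds on the Hessian penalty, and Table 5 tunes a weight for the vanilla variant (Peebles et al., 2020) as a baseline. Below is a minimal JAX sketch of that vanilla finite-difference estimator, assuming a decoder `g` that maps a latent vector to a flat output vector; the function name, probe count, and step size `eps` are illustrative choices, not the authors' released code.

```python
import jax
import jax.numpy as jnp

def vanilla_hessian_penalty(g, z, key, num_probes=2, eps=0.1):
    """Finite-difference Hessian penalty estimator (Peebles et al., 2020).

    Penalizes off-diagonal entries of the Hessian of decoder `g` with
    respect to latent `z` via the variance of v^T H v over Rademacher
    probes v.
    """
    # Rademacher probes v in {-1, +1}^{n_z}, one per row.
    v = jax.random.rademacher(key, (num_probes, *z.shape)).astype(z.dtype)
    # Second-order central difference approximates v^T H v per output unit.
    vthv = jax.vmap(
        lambda vi: (g(z + eps * vi) - 2.0 * g(z) + g(z - eps * vi)) / eps**2
    )(v)
    # Unbiased variance over probes, reduced with a max over output units.
    return jnp.max(jnp.var(vthv, axis=0, ddof=1))

# Toy usage with a stand-in linear-tanh "decoder".
key, subkey = jax.random.split(jax.random.PRNGKey(0))
w = jax.random.normal(subkey, (4, 16))
penalty = vanilla_hessian_penalty(lambda z: jnp.tanh(z @ w), jnp.zeros(4), key)
```

When the probe variance is zero, interactions between different latents do not affect the decoder's second-order behavior along the probed directions, which is the "minimal functional dependence" the penalty encourages.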
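As a companion to the Dataset Splits row, here is a small sketch of that protocol under stated assumptions: the full dataset is used for unsupervised training, and metrics are computed on a fixed random 10,000-sample subset. The dataset size and seed below are illustrative, not values from the paper.

```python
import jax
import jax.numpy as jnp

num_samples = 480_000  # illustrative; e.g., Shapes3D has 480,000 images
key = jax.random.PRNGKey(0)

# The entire dataset is used for unsupervised training ...
train_idx = jnp.arange(num_samples)

# ... while evaluation metrics are computed on a fixed 10,000-sample
# subset drawn without replacement (Locatello et al., 2019).
eval_idx = jax.random.choice(key, num_samples, shape=(10_000,), replace=False)
```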