Theory and Evaluation Metrics for Learning Disentangled Representations
Authors: Kien Do, Truyen Tran
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a comprehensive suite of experiments, we show that our metrics correctly characterize the representations learned by different methods and are consistent with qualitative (visual) results. Thus, the metrics allow disentanglement learning methods to be compared on a fair ground. We also empirically uncovered new interesting properties of VAE-based methods and interpreted them with our formulation. |
| Researcher Affiliation | Academia | Kien Do and Truyen Tran, Applied AI Institute, Deakin University, Geelong, Australia. {dkdo,truyen.tran}@deakin.edu.au |
| Pseudocode | No | The paper describes algorithms and formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use our proposed metrics to evaluate three representation learning methods namely FactorVAE (Kim & Mnih, 2018), β-VAE (Higgins et al., 2017a) and AAE (Makhzani et al., 2015) on both real and toy datasets which are CelebA (Liu et al., 2015) and dSprites (Matthey et al., 2017), respectively. |
| Dataset Splits | Yes | The CelebA dataset (Liu et al., 2015) consists of more than 200 thousand face images with 40 binary attributes. We resize these images to 64×64. The dSprites dataset (Matthey et al., 2017) is a toy dataset generated from 5 different factors of variation which are shape (3 values), scale (6 values), rotation (40 values), x-position (32 values), y-position (32 values). Statistics of these datasets are provided in Table 2: CelebA has 162,770 train and 19,962 test images of size 64×64×3; dSprites has 737,280 train images of size 64×64×1 and no test split. (A hedged data-loading sketch follows the table.) |
| Hardware Specification | No | The paper describes the models, datasets, and training settings, but does not explicitly mention any specific hardware used for the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2014) optimizer, but it does not specify version numbers for any programming languages or libraries used for implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We trained the models for 300 epochs with mini-batches of size 64. The learning rate is 10⁻³ for the encoder/decoder and is 10⁻⁴ for the discriminator over z. We used Adam (Kingma & Ba, 2014) optimizer with β1 = 0.5 and β2 = 0.99. Unless explicitly mentioned, we use the following default settings: i) for CelebA: the number of latent variables is 65, the TC coefficient in FactorVAE is 50, the value for β in β-VAE is 50, and the coefficient for the generator loss over z in AAE is 50; ii) for dSprites: the number of latent variables is 10. (A hedged configuration sketch follows the table.) |
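
The dataset splits reported above can be reproduced with standard tooling. Below is a minimal data-loading sketch assuming PyTorch and torchvision; the file paths, the use of the torchvision `CelebA` loader, and the dSprites `.npz` layout are our assumptions, since the paper does not publish code.

```python
# Hedged sketch: loading CelebA and dSprites with the splits reported in the paper.
# Assumes PyTorch/torchvision; paths and the .npz layout are assumptions, not from the paper.
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import datasets, transforms

# CelebA: 162,770 train / 19,962 test images, resized to 64x64 (3 channels).
celeba_tf = transforms.Compose([
    transforms.Resize((64, 64)),  # paper resizes CelebA images to 64x64
    transforms.ToTensor(),
])
celeba_train = datasets.CelebA(root="data", split="train", transform=celeba_tf, download=True)
celeba_test = datasets.CelebA(root="data", split="test", transform=celeba_tf, download=True)

# dSprites: 737,280 64x64x1 images generated from 5 factors (shape, scale,
# rotation, x-position, y-position); the paper reports no held-out test split.
dsprites = np.load("data/dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz")  # assumed path
imgs = torch.from_numpy(dsprites["imgs"]).unsqueeze(1)  # (737280, 1, 64, 64), uint8
dsprites_train = TensorDataset(imgs)  # convert to float per batch during training

train_loader = DataLoader(celeba_train, batch_size=64, shuffle=True)  # batch size from the paper
```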
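Likewise, the experiment-setup row translates directly into optimizer configuration. The following sketch mirrors the stated hyperparameters under the assumption of a PyTorch implementation; `encoder`, `decoder`, and `discriminator` are hypothetical placeholder modules standing in for the unpublished architectures.

```python
# Hedged sketch of the reported training configuration, assuming PyTorch.
# The nn.Linear placeholders below are ours; only the hyperparameter values
# (epochs, batch size, learning rates, Adam betas, coefficients) come from the paper.
import torch
import torch.nn as nn

encoder = nn.Linear(64 * 64, 65)   # 65 latent variables for CelebA (10 for dSprites)
decoder = nn.Linear(65, 64 * 64)
discriminator = nn.Linear(65, 1)   # discriminator over z (FactorVAE / AAE)

# Adam with beta1 = 0.5, beta2 = 0.99; lr 1e-3 for encoder/decoder, 1e-4 for the discriminator.
opt_model = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()),
    lr=1e-3, betas=(0.5, 0.99),
)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.99))

EPOCHS = 300     # trained for 300 epochs
BATCH_SIZE = 64  # mini-batches of size 64
COEFF = 50.0     # TC coefficient (FactorVAE), beta (beta-VAE), and AAE generator-loss weight on CelebA
```

Note that separate optimizers for the encoder/decoder and the discriminator over z reflect the two learning rates the paper reports; how gradients are alternated between them is not specified and is left out of this sketch.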