Theory and Evaluation Metrics for Learning Disentangled Representations
Authors: Kien Do, Truyen Tran
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a comprehensive suite of experiments, we show that our metrics correctly characterize the representations learned by different methods and are consistent with qualitative (visual) results. Thus, the metrics allow disentanglement learning methods to be compared on a fair ground. We also empirically uncovered new interesting properties of VAE-based methods and interpreted them with our formulation. |
| Researcher Affiliation | Academia | Kien Do and Truyen Tran, Applied AI Institute, Deakin University, Geelong, Australia. {dkdo,truyen.tran}@deakin.edu.au |
| Pseudocode | No | The paper describes algorithms and formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use our proposed metrics to evaluate three representation learning methods namely FactorVAE (Kim & Mnih, 2018), β-VAE (Higgins et al., 2017a) and AAE (Makhzani et al., 2015) on both real and toy datasets which are CelebA (Liu et al., 2015) and dSprites (Matthey et al., 2017), respectively. |
| Dataset Splits | Yes | The CelebA dataset (Liu et al., 2015) consists of more than 200 thousand face images with 40 binary attributes. We resize these images to 64×64. The dSprites dataset (Matthey et al., 2017) is a toy dataset generated from 5 different factors of variation which are shape (3 values), scale (6 values), rotation (40 values), x-position (32 values), y-position (32 values). Statistics of these datasets are provided in Table 2: CelebA has 162,770 train and 19,962 test images of size 64×64×3; dSprites has 737,280 train images of size 64×64×1 and no test split. (A hedged data-loading sketch follows the table.) |
| Hardware Specification | No | The paper describes the models, datasets, and training settings, but does not explicitly mention any specific hardware used for the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2014) optimizer, but it does not specify version numbers for any programming languages or libraries used for implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We trained the models for 300 epochs with mini-batches of size 64. The learning rate is 10⁻³ for the encoder/decoder and is 10⁻⁴ for the discriminator over z. We used Adam (Kingma & Ba, 2014) optimizer with β1 = 0.5 and β2 = 0.99. Unless explicitly mentioned, we use the following default settings: i) for CelebA: the number of latent variables is 65, the TC coefficient in FactorVAE is 50, the value for β in β-VAE is 50, and the coefficient for the generator loss over z in AAE is 50; ii) for dSprites: the number of latent variables is 10. (A hedged configuration sketch follows the table.) |
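
The dataset splits reported above can be reproduced with standard tooling. Below is a minimal data-loading sketch assuming PyTorch and torchvision; the file paths, the use of the torchvision `CelebA` loader, and the dSprites `.npz` layout are our assumptions, since the paper does not publish code.

```python
# Hedged sketch: loading CelebA and dSprites with the splits reported in the paper.
# Assumes PyTorch/torchvision; paths and the .npz layout are assumptions, not from the paper.
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import datasets, transforms

# CelebA: 162,770 train / 19,962 test images, resized to 64x64 (3 channels).
celeba_tf = transforms.Compose([
    transforms.Resize((64, 64)),  # paper resizes CelebA images to 64x64
    transforms.ToTensor(),
])
celeba_train = datasets.CelebA(root="data", split="train", transform=celeba_tf, download=True)
celeba_test = datasets.CelebA(root="data", split="test", transform=celeba_tf, download=True)

# dSprites: 737,280 64x64x1 images generated from 5 factors (shape, scale,
# rotation, x-position, y-position); the paper reports no held-out test split.
dsprites = np.load("data/dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz")  # assumed path
imgs = torch.from_numpy(dsprites["imgs"]).unsqueeze(1)  # (737280, 1, 64, 64), uint8
dsprites_train = TensorDataset(imgs)  # convert to float per batch during training

train_loader = DataLoader(celeba_train, batch_size=64, shuffle=True)  # batch size from the paper
```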
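Likewise, the experiment-setup row translates directly into optimizer configuration. The following sketch mirrors the stated hyperparameters under the assumption of a PyTorch implementation; `encoder`, `decoder`, and `discriminator` are hypothetical placeholder modules standing in for the unpublished architectures.

```python
# Hedged sketch of the reported training configuration, assuming PyTorch.
# The nn.Linear placeholders below are ours; only the hyperparameter values
# (epochs, batch size, learning rates, Adam betas, coefficients) come from the paper.
import torch
import torch.nn as nn

encoder = nn.Linear(64 * 64, 65)   # 65 latent variables for CelebA (10 for dSprites)
decoder = nn.Linear(65, 64 * 64)
discriminator = nn.Linear(65, 1)   # discriminator over z (FactorVAE / AAE)

# Adam with beta1 = 0.5, beta2 = 0.99; lr 1e-3 for encoder/decoder, 1e-4 for the discriminator.
opt_model = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()),
    lr=1e-3, betas=(0.5, 0.99),
)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.99))

EPOCHS = 300     # trained for 300 epochs
BATCH_SIZE = 64  # mini-batches of size 64
COEFF = 50.0     # TC coefficient (FactorVAE), beta (beta-VAE), and AAE generator-loss weight on CelebA
```

Note that separate optimizers for the encoder/decoder and the discriminator over z reflect the two learning rates the paper reports; how gradients are alternated between them is not specified and is left out of this sketch.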