Hyperprior Induced Unsupervised Disentanglement of Latent Representations

Authors: Abdul Fatir Ansari, Harold Soh (pp. 3175–3182)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and CelebA) show our approach to outperform the β-VAE and is competitive with the state-of-the-art FactorVAE. Our approach achieves significantly better disentanglement and reconstruction on a new dataset (Correlated Ellipses) which introduces correlations between the factors of variation."
Researcher Affiliation | Academia | Abdul Fatir Ansari, Harold Soh, Department of Computer Science, National University of Singapore, {afatir, harold}@comp.nus.edu.sg
Pseudocode | No | The paper does not contain a pseudocode block or an explicitly labeled algorithm section.
Open Source Code | Yes | "Details are available in the supplementary material and our code base is available for download at https://github.com/crslab/CHyVAE."
Open Datasets | Yes | 2DShapes (or dSprites) (Matthey et al. 2017): 737,280 binary 64×64 images of 2D shapes... 3DFaces (Paysan et al. 2009): 239,840 greyscale 64×64 images of 3D faces. 3DChairs (Aubry et al. 2014): 86,366 RGB 64×64 images of CAD chair models. CelebA (Liu et al. 2015): 202,599 RGB images of celebrity faces center-cropped to dimensions 64×64.
Dataset Splits | No | The paper does not provide specific details regarding dataset splits for training, validation, or testing, nor does it cite a standard split used for these datasets.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using the "Adam optimizer" and a "convolutional neural network (CNN) for the encoder and a deconvolutional NN for the decoder" but does not specify software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | "To ease comparisons between the methods and prior work, we use the same network architecture across all the compared methods. Specifically, we follow the model in (Kim and Mnih 2018): a convolutional neural network (CNN) for the encoder and a deconvolutional NN for the decoder. We normalize all datasets to [0, 1] and use sigmoid cross-entropy as the reconstruction loss function. For training, we use Adam optimizer (Kingma and Ba 2014) with a learning rate of 1e-4. For the discriminator in FactorVAE, we use the parameters recommended by Kim and Mnih (2018)."
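The setup described in the Experiment Setup row can be sketched in code. This is a minimal, hedged PyTorch sketch, not the authors' implementation: the exact layer widths and the latent dimension (`LATENT_DIM = 10`) are assumptions patterned after the Kim and Mnih (2018) architecture the row cites, and the model-specific hyperprior/KL term of the paper is omitted; only the pieces stated in the row (64×64 inputs in [0, 1], sigmoid cross-entropy reconstruction loss, Adam with learning rate 1e-4) are taken from the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 10  # assumption: the page does not state the latent size

# Convolutional encoder: 1x64x64 input down to mean and log-variance.
encoder = nn.Sequential(
    nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32
    nn.Conv2d(32, 32, 4, stride=2, padding=1), nn.ReLU(),  # -> 16x16
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 8x8
    nn.Conv2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 4x4
    nn.Flatten(),
    nn.Linear(64 * 4 * 4, 2 * LATENT_DIM),  # mean and log-variance
)

# Deconvolutional decoder: latent code back up to 1x64x64 logits.
decoder = nn.Sequential(
    nn.Linear(LATENT_DIM, 64 * 4 * 4), nn.ReLU(),
    nn.Unflatten(1, (64, 4, 4)),
    nn.ConvTranspose2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # logits, 64x64
)

# Adam with learning rate 1e-4, as stated in the setup row.
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4
)

x = torch.rand(8, 1, 64, 64)  # dummy batch; inputs normalized to [0, 1]
mu, logvar = encoder(x).chunk(2, dim=1)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
logits = decoder(z)

# Sigmoid cross-entropy reconstruction loss, as stated in the setup row
# (the paper's hyperprior/KL regularizer would be added here).
recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum") / x.size(0)
recon.backward()
opt.step()
```

The decoder emits logits rather than probabilities so the reconstruction term can use the numerically stable `binary_cross_entropy_with_logits`.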