Explicit Disentanglement of Appearance and Perspective in Generative Models

Authors: Nicki Skafte, Søren Hauberg

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our model separates the visual style from digit type on MNIST, separates shape and pose in images of human bodies, and facial features from facial shape on CelebA. ... we demonstrate our model on four datasets: the standard disentanglement benchmark dSprites, disentanglement of style and content on MNIST, pose and shape on images of human bodies (Fig. 2), and facial features and facial shape on CelebA.
Researcher Affiliation | Academia | Nicki S. Detlefsen (nsde@dtu.dk) and Søren Hauberg (sohau@dtu.dk), Section for Cognitive Systems, Technical University of Denmark
Pseudocode | No | The paper describes its models and procedures in prose and diagrams but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The models were implemented in PyTorch [Paszke et al., 2017] and the code is available at https://github.com/SkafteNicki/unsuper/.
Open Datasets | Yes | We initially test our models on the dSprites dataset [Matthey et al., 2017]... Secondly, we test our model on the MNIST dataset [LeCun et al., 1998]... We now consider synthetic image data of human bodies generated by the Skinned Multi-Person Linear Model (SMPL) [Loper et al., 2015]... Finally, we qualitatively evaluated our proposed model on the CelebA dataset [Liu et al., 2015]. (A loading sketch follows the table.)
Dataset Splits | No | For the SMPL dataset, the paper states 'We generate 10,000 bodies (8,000 for training, 2,000 for testing)', clearly defining training and testing sets but not mentioning a validation set. For the other datasets, such as MNIST and dSprites, no split information is provided, including for validation. (An illustrative re-split sketch follows the table.)
Hardware Specification | No | The paper acknowledges 'NVIDIA Corporation with the donation of GPU hardware used for this research', but it does not specify a particular GPU model (e.g., Tesla V100, RTX 3090) or other hardware components such as CPU or memory.
Software Dependencies | No | The paper states 'The models were implemented in Pytorch [Paszke et al., 2017]'. While PyTorch is named as a software dependency, no specific version number is given for it, which would be required for exact reproducibility. (A version-logging sketch follows the table.)
Experiment Setup | Yes | For all experiments, we train a standard VAE, a β-VAE [Higgins et al., 2017], a β-TCVAE [Chen et al., 2018], a DIP-VAE-II [Kumar et al., 2017] and our developed VITAE model. ... For β-VAE ... we use β = 8.0 based on qualitative evaluation of results. ... To make the task more difficult, we artificially augment the dataset by first randomly rotating each image by an angle uniformly chosen in the interval [-20, 20] and secondly translating the images by t = [x, y], where x and y are uniformly chosen from the interval [-3, 3]. (An augmentation and loss sketch follows the table.)
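
All four corpora are standard public releases. As a minimal loading sketch, not taken from the authors' repository, the two torchvision-packaged datasets could be obtained as follows; dSprites and the SMPL renderings ship through their own project pages:

```python
import torchvision
from torchvision import transforms

to_tensor = transforms.ToTensor()

# MNIST downloads automatically through torchvision.
mnist_train = torchvision.datasets.MNIST(
    root="data", train=True, download=True, transform=to_tensor
)

# CelebA is also packaged by torchvision, though the automatic download
# can hit hosting quotas and may require a manual fetch into `data/`.
celeba_train = torchvision.datasets.CelebA(
    root="data", split="train", download=True, transform=to_tensor
)
```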
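Because only the SMPL train/test sizes are reported, a re-implementation has to choose its own splitting procedure. A minimal sketch, assuming a seeded random 8,000/2,000 partition; `smpl_dataset` is a placeholder for the rendered bodies, and the seed is an assumption:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder standing in for the 10,000 rendered SMPL bodies.
smpl_dataset = TensorDataset(torch.randn(10_000, 3, 64, 64))

# Assumed: a seeded random 80/20 split matching the reported sizes.
generator = torch.Generator().manual_seed(0)
train_set, test_set = random_split(smpl_dataset, [8_000, 2_000], generator=generator)
```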
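Since neither the GPU model nor the PyTorch version is reported, anyone replicating the experiments may want to record their own environment. A minimal sketch:

```python
import torch

# Log the framework and hardware actually used, since the paper omits both.
print("PyTorch version:", torch.__version__)
print("CUDA toolkit:", torch.version.cuda)
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```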
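Finally, the quoted augmentation and the β-weighted objective translate naturally into PyTorch. The sketch below is an assumption about the implementation, not the authors' code: it targets 28×28 MNIST inputs (torchvision expresses translation as a fraction of image size, so ±3 px becomes 3/28) and uses the standard β-VAE loss with the stated β = 8.0; the Bernoulli reconstruction term is likewise an assumption.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# Rotation uniformly in [-20, 20] degrees, then translation of up to
# +/-3 pixels in x and y (3/28 of a 28x28 MNIST image).
augment = transforms.Compose([
    transforms.RandomRotation(degrees=20),
    transforms.RandomAffine(degrees=0, translate=(3 / 28, 3 / 28)),
    transforms.ToTensor(),
])

def beta_vae_loss(x_hat, x, mu, log_var, beta=8.0):
    """Standard beta-VAE objective: reconstruction + beta * KL.

    Assumes Bernoulli reconstructions (x_hat in [0, 1]); the paper does
    not spell out the likelihood, so this term is an assumption.
    """
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + beta * kl
```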