Demystifying Inter-Class Disentanglement

Authors: Aviv Gabbay, Yedid Hoshen

ICLR 2020

Reproducibility Variable / Result / LLM Response

Research Type: Experimental
  "In extensive experiments, our method is shown to achieve better disentanglement performance than both adversarial and non-adversarial methods that use the same level of supervision."

Researcher Affiliation: Academia
  "Aviv Gabbay, Yedid Hoshen. School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel."

Pseudocode: Yes
  "Algorithm 1: Style clustering"

Open Source Code: No
  The paper provides a project webpage (http://www.vision.huji.ac.il/lord) but does not explicitly state that source code for the described method is available there or elsewhere.

Open Datasets: Yes
  "Datasets: We evaluate the performance of our method and the baselines on several datasets (each with the appropriate class labels): Cars3D (car model as class label, azimuth and elevation as content), SmallNORB (object type, lighting, and elevation as class labels, azimuth as content), SmallNORB Poses (object type and lighting as class labels, azimuth and elevation as content), CelebA (person identity as class label, other unlabeled transitory facial attributes, e.g. head pose and expression, as content), KTH (person identity as class label, other unlabeled transitory attributes, e.g. skeleton position, as content), RaFD (facial expression as class label, rest as varied content). A more detailed description of each dataset and configuration can be found in Appendix A.2." Cited datasets: Cars3D (Reed et al., 2015), SmallNORB (LeCun et al., 2004), CelebA (Liu et al., 2015), KTH (Laptev et al., 2004), RaFD (Langner et al., 2010), Edges2Shoes (Yu & Grauman, 2014), Anime (Mckinsey, 2019).

Dataset Splits: No
  The paper describes train/test splits for various datasets (e.g., "163 car models for training and the other 20 are held out for testing" for Cars3D, and "holding out 10% of the images for testing" for SmallNORB Poses, KTH, and RaFD), but no explicit validation splits are mentioned.

Hardware Specification: No
  No specific hardware details such as GPU/CPU models, memory, or cloud computing instances are provided.

Software Dependencies: No
  The paper mentions software components and methods such as the "ADAM method" and "SGD" but does not specify software dependencies with version numbers (e.g., "PyTorch 1.9" or "TensorFlow 2.x").

Experiment Setup: Yes
  "The architecture of the generator consists of 3 fully-connected layers followed by 6 convolutional layers (the first 4 of them are preceded by an upsampling layer and followed by AdaIN normalization). We set the size of the content latent code to 128 and the size of the class code to 256 in all our experiments. We regularize the content embeddings with additive Gaussian noise with µ = 0 and σ = 1 and an activation decay with λ = 0.001. We perform the latent optimization using SGD with the ADAM method for 200 epochs, with a learning rate of 0.0001 for the generator and 0.001 for the latent codes. For each mini-batch, we update the parameters of the generator and the latent codes with a single gradient step each. For the second stage, the class and content encoders are CNNs with 5 convolutional layers and 3 fully-connected layers."
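The content-embedding regularization quoted above (additive Gaussian noise with µ = 0, σ = 1, plus an activation decay with λ = 0.001) can be sketched in NumPy as follows. The function name and interface are illustrative, not taken from the paper; only the noise and decay hyperparameters come from the quoted setup.

```python
import numpy as np

def regularize_content_code(code, sigma=1.0, decay=0.001, rng=None):
    """Sketch of the content-code regularization described in the setup.

    Returns the noise-perturbed code (the version fed to the generator
    during training) and the activation-decay penalty added to the loss.
    Name and interface are illustrative, not from the paper.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    # Additive Gaussian noise with mu = 0 and sigma = 1 (paper's defaults).
    noisy = code + rng.normal(0.0, sigma, size=code.shape)
    # Activation decay: lambda * ||code||^2 with lambda = 0.001.
    penalty = decay * float(np.sum(code ** 2))
    return noisy, penalty

# A 128-dimensional content code, matching the size reported in the setup.
code = np.ones(128)
noisy, penalty = regularize_content_code(code)
print(noisy.shape, round(penalty, 3))  # (128,) 0.128
```

Note that the penalty is computed on the clean code, so the decay term discourages large content embeddings while the noise forces the generator to tolerate small perturbations of them.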