On the Transfer of Disentangled Representations in Realistic Settings

Authors: Andrea Dittadi, Frederik Träuble, Francesco Locatello, Manuel Wüthrich, Vaibhav Agrawal, Ole Winther, Stefan Bauer, Bernhard Schölkopf

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose new architectures in order to scale disentangled representation learning to realistic high-resolution settings and conduct a large-scale empirical study of disentangled representations on this dataset.
Researcher Affiliation | Academia | 1 Technical University of Denmark, 2 Max Planck Institute for Intelligent Systems, 3 ETH Zurich, Department of Computer Science, 4 Copenhagen University Hospital, 5 University of Copenhagen, 6 CIFAR Azrieli Global Scholar
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states that the *datasets* are made publicly available with a URL, but it does not provide an explicit statement or link for the *source code* of the methodology described in the paper.
Open Datasets | Yes | We propose a dataset consisting of simulated observations from a scene where a robotic arm interacts with a cube in a stage (see Fig. 1). ... Additionally, we recorded an annotated dataset under the same conditions in the real-world setup: we acquired 1,809 camera images from the same viewpoint and recorded the labels of the 7 underlying factors of variation. ... These datasets are made publicly available. Dataset URL: http://people.tuebingen.mpg.de/ei-datasets/iclr_transfer_paper/robot_finger_datasets.tar (6.18 GB)
Dataset Splits | No | The paper specifies training and testing set sizes for downstream tasks (10k and 5k images, respectively) but does not explicitly describe a separate dataset split for validation of the main models or for general model development. It mentions 'validation' in the context of evaluation metrics, but not as a dataset partition.
Hardware Specification | Yes | Training these models requires approximately 2.8 GPU years on NVIDIA Tesla V100 PCIe.
Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma & Ba, 2014) but does not provide specific version numbers for any software, libraries, or programming languages used in the implementation or experiments (e.g., Python version, PyTorch/TensorFlow version).
Experiment Setup | Yes | The hyperparameter sweep is defined as follows: We train the models using either unsupervised learning or weakly supervised learning (Locatello et al., 2020). ... We vary the parameter β in {1, 2, 4}, and use linear deterministic warm-up (Bowman et al., 2015; Sønderby et al., 2016) over the first {0, 10000, 50000} training steps. The latent space dimensionality is in {10, 25, 50}. Half of the models are trained with additive noise in the input image. ... Each of the 108 resulting configurations is trained with 10 random seeds. ... We use a batch size of 64 and train for 400k steps. The learning rate is initialized to 1e-4 and halved at 150k and 300k training steps. We clip the global gradient norm to 1.0 before each weight update.
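The Experiment Setup row fully specifies the sweep grid and the optimization schedule, so it can be written down compactly in code. Below is a minimal sketch of that configuration, assuming a PyTorch-style setup (the paper does not state its framework, as noted under Software Dependencies); `model`, `make_optimizer`, `beta_at_step`, and `training_step` are illustrative names, not taken from the paper, and the assumption that the linear warm-up is applied to the KL weight β follows common practice rather than an explicit statement in the quote.

```python
import itertools

import torch

# Hyperparameter grid quoted above: 2 x 3 x 3 x 3 x 2 = 108 configurations,
# each trained with 10 random seeds (1080 models in total).
SWEEP = list(itertools.product(
    ["unsupervised", "weakly_supervised"],  # training regime (Locatello et al., 2020)
    [1, 2, 4],                              # beta
    [0, 10_000, 50_000],                    # linear deterministic warm-up steps
    [10, 25, 50],                           # latent space dimensionality
    [False, True],                          # additive noise on the input image
))
assert len(SWEEP) == 108


def beta_at_step(step: int, beta_max: float, warmup_steps: int) -> float:
    """Assumed linear warm-up of the KL weight beta over the first `warmup_steps` steps."""
    if warmup_steps == 0:
        return beta_max
    return beta_max * min(step / warmup_steps, 1.0)


def make_optimizer(model: torch.nn.Module):
    """Adam at lr 1e-4, halved at 150k and 300k of the 400k training steps (batch size 64)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150_000, 300_000], gamma=0.5)
    return optimizer, scheduler


def training_step(model, optimizer, scheduler, loss):
    """One update: backprop, clip the global gradient norm to 1.0, then step optimizer and schedule."""
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()  # milestones are given in steps, so the scheduler is stepped once per batch
```

The grid (2 training regimes × 3 β values × 3 warm-up lengths × 3 latent dimensionalities × 2 noise settings) reproduces exactly the 108 configurations quoted above; with 10 seeds each, the reported 2.8 GPU years correspond to roughly 22-23 V100-hours per model.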