Evaluating Disentanglement of Structured Representations

Authors: Raphaël Dang-Nhu

ICLR 2022

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.

Research Type: Experimental
"Experimentally, we demonstrate that viewing object compositionality as a disentanglement problem addresses several issues with prior visual metrics of object separation. As a core technical component, we present the first representation probing algorithm handling slot permutation invariance."

Researcher Affiliation: Academia
"Raphaël Dang-Nhu. This work constitutes the public version of Raphaël Dang-Nhu's Master Thesis at ETH Zürich."

Pseudocode: Yes
"Algorithm 1: Permutation-invariant representation probing"
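
The algorithm itself is not reproduced on this page. Purely as an illustration of what permutation-invariant probing can look like, here is a minimal Python sketch: it assumes slots are aligned to ground-truth objects by fitting a temporary Ridge predictor per (slot, object) pair and solving an optimal assignment over the resulting errors. The function name, matching criterion, and array shapes are illustrative assumptions, not the paper's exact Algorithm 1.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.linear_model import Ridge


def match_slots_to_objects(slots, factors):
    """Align model slots with ground-truth objects (illustrative sketch).

    slots:   (n_samples, n_slots, slot_dim) slot representations
    factors: (n_samples, n_objects, n_factors) ground-truth factors

    Returns the slot index assigned to each object, shared across the
    group of samples. Assumption: the matching cost is the mean squared
    error of a temporary Ridge predictor fit per (slot, object) pair;
    the paper's Algorithm 1 may use a different criterion.
    """
    n_slots = slots.shape[1]
    n_objects = factors.shape[1]

    # Cost of explaining object j from slot i, estimated by a temporary
    # Ridge predictor fit across all samples in the group.
    cost = np.zeros((n_slots, n_objects))
    for i in range(n_slots):
        for j in range(n_objects):
            probe = Ridge(alpha=1.0).fit(slots[:, i], factors[:, j])
            pred = probe.predict(slots[:, i])
            cost[i, j] = np.mean((pred - factors[:, j]) ** 2)

    # Resolve the slot permutation with an optimal (Hungarian) assignment.
    slot_idx, obj_idx = linear_sum_assignment(cost)
    return slot_idx[np.argsort(obj_idx)]
```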

Open Source Code: No
"The paper mentions using public PyTorch implementations of existing architectures (MONet, GENESIS) and a third-party implementation for IODINE, but it does not state that the code for its own metric and probing algorithm is open source, nor does it provide a link to it."

Open Datasets: Yes
"We evaluate all models on CLEVR6 (Johnson et al., 2017) and Multi-dSprites (Matthey et al., 2017; Burgess et al., 2019), with the exception of IODINE, which we restricted to Multi-dSprites for computational reasons, as CLEVR6 requires a week on 8 V100 GPUs per training."

Dataset Splits: Yes
"Each group has 5000 samples, with a 4000/500/500 split for fitting, validation, and evaluation of the factor predictor."
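
As a concrete reading of the split above, here is a minimal sketch. It assumes each group of 5000 samples is partitioned by shuffled indices; the excerpt does not say whether the partition is shuffled or contiguous, so the seed and helper name are illustrative.

```python
import numpy as np


def split_group(group, seed=0):
    """Split one group of 5000 samples into 4000/500/500 subsets for
    fitting, validation, and evaluation of the factor predictor.
    Assumption: a shuffled index split; the excerpt does not specify."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(group))  # len(group) == 5000
    fit, val, test = idx[:4000], idx[4000:4500], idx[4500:]
    return group[fit], group[val], group[test]
```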

Hardware Specification: Yes
"CLEVR6 requires a week on 8 V100 GPUs per training. Models were trained with one to four V100 GPUs. This work was granted access to the HPC resources of IDRIS under the allocation 2020-AD011012138 made by GENCI."

Software Dependencies: No
"The paper mentions using 'public PyTorch implementations' and references third-party implementations for models, but it does not specify version numbers for PyTorch or any other software dependencies (libraries, frameworks, or specific tools) needed to replicate the experiments."

Experiment Setup: Yes
"We train MONet exactly as in Burgess et al. (2019), except that we set σ_fg = 0.1 and σ_bg = 0.06, which was shown to yield better results in Greff et al. (2019). [...] For the temporary predictors in Algorithm 1 (inside the loop), we use a linear model with Ridge regularization. For the final predictor, we use a random forest with 10 trees and a maximum depth of 15. [...] We train all models for 200 epochs."
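
The stated predictor choices map directly onto standard scikit-learn estimators. A minimal sketch under that assumption follows; the Ridge regularization strength is not reported in the excerpt, so alpha is an illustrative default, and a regressor is assumed even though categorical factors would call for a classifier.

```python
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

# Temporary predictor used inside the Algorithm 1 loop: a linear model
# with Ridge regularization (alpha=1.0 is an assumed default; the
# excerpt does not report the value used).
temporary_predictor = Ridge(alpha=1.0)

# Final factor predictor: a random forest with 10 trees and a maximum
# depth of 15, as stated in the experiment setup.
final_predictor = RandomForestRegressor(n_estimators=10, max_depth=15)
```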