Evaluating Disentanglement of Structured Representations
Authors: Raphaël Dang-Nhu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we demonstrate that viewing object compositionality as a disentanglement problem addresses several issues with prior visual metrics of object separation. As a core technical component, we present the first representation probing algorithm handling slot permutation invariance. |
| Researcher Affiliation | Academia | Raphaël Dang-Nhu. This work constitutes the public version of Raphaël Dang-Nhu's Master Thesis at ETH Zürich. |
| Pseudocode | Yes | Algorithm 1: Permutation-invariant representation probing (a sketch of this probing loop appears after the table). |
| Open Source Code | No | The paper mentions using public PyTorch implementations of existing architectures (MONet, GENESIS) and a third-party implementation for IODINE, but it does not state that its own novel metric or probing algorithm code is open-source or provide a link to it. |
| Open Datasets | Yes | We evaluate all models on CLEVR6 (Johnson et al., 2017) and Multi-dSprites (Matthey et al., 2017; Burgess et al., 2019), with the exception of IODINE, which we restricted to Multi-dSprites for computational reasons, as CLEVR6 requires a week on 8 V100 GPUs per training run. |
| Dataset Splits | Yes | Each group has 5000 samples, with a 4000/500/500 split for fitting, validation, and evaluation of the factor predictor (a minimal split helper is sketched after the table). |
| Hardware Specification | Yes | CLEVR6 requires a week on 8 V100 GPUs per training run. Models were trained with one to four V100 GPUs. This work was granted access to the HPC resources of IDRIS under the allocation 2020-AD011012138 made by GENCI. |
| Software Dependencies | No | The paper mentions using 'public PyTorch implementations' and references third-party implementations for models, but it does not specify version numbers for PyTorch or any other software dependencies (libraries, frameworks, or specific tools) needed to replicate the experiments. |
| Experiment Setup | Yes | We train MONet exactly as in Burgess et al. (2019), except that we set σ_fg = 0.1 and σ_bg = 0.06, which was shown to yield better results in Greff et al. (2019). [...] For the temporary predictors in Algorithm 1 (inside the loop), we use a linear model with Ridge regularization. For the final predictor, we use a random forest with 10 trees and a maximum depth of 15. [...] We train all models for 200 epochs. |
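
The quoted setup pins down enough of Algorithm 1 to sketch its shape. Below is a minimal Python sketch of permutation-invariant representation probing, assuming slot representations of shape (samples, slots, features) and per-object factor labels. The function name, the number of refinement rounds, and the use of Hungarian matching on prediction error to align slots with objects are illustrative assumptions, not details confirmed by the quotes above; only the Ridge temporary predictor and the 10-tree, depth-15 random forest final predictor come from the paper's stated setup.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge


def probe_permutation_invariant(slots, factors, n_rounds=3):
    """Sketch of permutation-invariant probing (not the paper's exact code).

    slots:   (n_samples, n_slots, d_slot) slot representations
    factors: (n_samples, n_slots, d_factor) per-object ground-truth factors,
             assumed padded so every sample has n_slots objects
    """
    n, k, d = slots.shape
    perm = np.tile(np.arange(k), (n, 1))   # object -> slot map, identity to start
    Y = factors.reshape(n * k, -1)         # targets stay in fixed object order

    for _ in range(n_rounds):
        # Temporary predictor on the current slot/object alignment
        # (the paper uses a linear model with Ridge regularization here).
        X = slots[np.arange(n)[:, None], perm].reshape(n * k, d)
        tmp = Ridge(alpha=1.0).fit(X, Y)

        # Re-align each sample: Hungarian matching on squared prediction error,
        # assigning every object the slot that predicts its factors best.
        for i in range(n):
            pred = tmp.predict(slots[i])                      # (k, d_factor)
            cost = ((pred[:, None, :] - factors[i][None, :, :]) ** 2).sum(-1)
            rows, cols = linear_sum_assignment(cost)          # slot, object pairs
            perm[i, cols] = rows

    # Final predictor on the aligned slots: a random forest with 10 trees
    # and maximum depth 15, as quoted in the experiment setup above.
    X = slots[np.arange(n)[:, None], perm].reshape(n * k, d)
    final = RandomForestRegressor(n_estimators=10, max_depth=15).fit(X, Y)
    return final, perm
```

Separating a cheap linear temporary predictor (refit inside the matching loop) from a stronger final predictor keeps the alignment step fast while still measuring how much factor information the aligned slots carry.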
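For the data splits, the report quotes groups of 5000 samples divided 4000/500/500 for fitting, validation, and evaluation of the factor predictor. A minimal helper under those numbers might look as follows; the shuffling and fixed seed are assumptions of this sketch, only the split sizes come from the paper.

```python
import numpy as np


def split_group(samples, n_fit=4000, n_val=500, n_eval=500, seed=0):
    """Split one 5000-sample group into fit/validation/evaluation subsets.

    The 4000/500/500 sizes are quoted from the paper; shuffling with a
    fixed seed is an assumption made so the sketch is reproducible.
    """
    assert len(samples) == n_fit + n_val + n_eval
    idx = np.random.default_rng(seed).permutation(len(samples))
    return (samples[idx[:n_fit]],
            samples[idx[n_fit:n_fit + n_val]],
            samples[idx[n_fit + n_val:]])
```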