Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Evaluating Disentanglement of Structured Representations

Authors: Raphaël Dang-Nhu

ICLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we demonstrate that viewing object compositionality as a disentanglement problem addresses several issues with prior visual metrics of object separation. As a core technical component, we present the first representation probing algorithm handling slot permutation invariance.
Researcher Affiliation | Academia | Raphaël Dang-Nhu. This work constitutes the public version of Raphaël Dang-Nhu's Master Thesis at ETH Zürich.
Pseudocode | Yes | Algorithm 1 Permutation-invariant representation probing
Open Source Code | No | The paper mentions using public PyTorch implementations of existing architectures (MONet, GENESIS) and a third-party implementation of IODINE, but it does not state that the code for its own novel metric or probing algorithm is open source, nor does it provide a link to it.
Open Datasets | Yes | We evaluate all models on CLEVR6 (Johnson et al., 2017) and Multi-dSprites (Matthey et al., 2017; Burgess et al., 2019), with the exception of IODINE that we restricted to Multi-dSprites for computational reasons, as CLEVR6 requires a week on 8 V100 GPUs per training.
Dataset Splits | Yes | Each group has 5000 samples, with a 4000/500/500 split for fitting, validation and evaluation of the factor predictor.
Hardware Specification | Yes | CLEVR6 requires a week on 8 V100 GPUs per training. Models were trained with one to four V100 GPUs. This work was granted access to the HPC resources of IDRIS under the allocation 2020-AD011012138 made by GENCI.
Software Dependencies | No | The paper mentions using 'public Pytorch implementations' and references third-party implementations for the models, but it does not specify version numbers for PyTorch or any other software dependencies (libraries, frameworks, or specific tools) needed to replicate the experiments.
Experiment Setup | Yes | We train MONet exactly as in Burgess et al. (2019), except that we set σfg = 0.1 and σbg = 0.06 which was shown to yield better results in (Greff et al., 2019). [...] For the temporary predictors in Algorithm 1 (inside the loop), we use a linear model with Ridge regularization. For the final predictor, we use a random forest with 10 trees, and a maximum depth of 15. [...] We train all models for 200 epochs.
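The Experiment Setup quote specifies the probing predictors concretely (a Ridge linear model inside the loop, a 10-tree random forest of maximum depth 15 as the final predictor, fit on the 4000-sample split). A minimal sketch of that configuration, assuming scikit-learn as the implementation library (the paper does not name one) and toy random arrays in place of the actual slot representations and factor labels:

```python
# Sketch of the quoted predictor configuration. Assumptions: scikit-learn
# estimators stand in for the paper's unspecified implementation, the Ridge
# regularization strength (alpha) is left at its default since the paper does
# not report it, and the data below is random, not the paper's.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Toy stand-ins sized like the quoted split: 4000 samples for fitting,
# 500 for evaluation; 32 is an arbitrary representation dimension.
X_fit, y_fit = rng.normal(size=(4000, 32)), rng.normal(size=4000)
X_eval = rng.normal(size=(500, 32))

# Temporary predictor used inside the probing loop: linear model with
# Ridge regularization (alpha is an assumption, not from the paper).
temp_predictor = Ridge(alpha=1.0).fit(X_fit, y_fit)

# Final factor predictor: random forest with 10 trees, maximum depth 15,
# matching the quoted hyperparameters.
final_predictor = RandomForestRegressor(n_estimators=10, max_depth=15,
                                        random_state=0).fit(X_fit, y_fit)
preds = final_predictor.predict(X_eval)  # one prediction per evaluation sample
```

This only illustrates the predictor hyperparameters the report extracted; the permutation-invariant probing loop itself (Algorithm 1) is not reproduced here.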