Object-centric architectures enable efficient causal representation learning

Authors: Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

ICLR 2024

Reproducibility checklist (each entry gives the variable, the result, and the supporting LLM response):
Research Type: Experimental
"We evaluated our method on 2D and 3D synthetic image datasets that allowed us to carefully control various aspects of the environment, such as the number of objects, their sizes, shapes, colors, relative positions, and dynamics. Examples of our 2D and 3D datasets are shown in Figures 1 and 3, respectively. ... We present the disentanglement scores for various combinations of properties and environments with k = {2, 3, 4} objects in the scene."
Researcher Affiliation: Collaboration
Amin Mansouri (Mila, Quebec AI Institute); Jason Hartford (Valence Labs); Yan Zhang (Samsung SAIT AI Lab, Montreal); Yoshua Bengio (Mila, Quebec AI Institute / U. Montreal / CIFAR)
Pseudocode: No
The paper does not contain any structured pseudocode or algorithm blocks labeled as such.
Open Source Code: Yes
"The code to reproduce our results can be found at: https://github.com/amansouri3476/OC-CRL"
Open Datasets: No
The paper describes generating its own synthetic datasets: "We use pygame engine (Shinners, 2011) for generating multi-object 2D scenes." and "For generating the 3D datasets we leverage kubric library (Greff et al., 2022) to obtain realistic scenes which we can highly customize." However, no concrete access information (link, DOI, or repository) is provided for the generated datasets themselves, nor are they stated to be well-known public datasets.
Dataset Splits: Yes
"For training we generate 1000 pairs per target property such that the model on average sees at least 500 samples for either positive or negative perturbations to each property, i.e., if we choose {posx, posy, color} as target properties, we will generate 3000 samples for training. The validation and test sets always have 1000 samples."
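As a sanity check on the reported split sizes, the training-set size scales linearly with the number of target properties. A minimal sketch (the function name and constants are ours, not the paper's):

```python
PAIRS_PER_PROPERTY = 1000  # training pairs generated per target property (reported)
VAL_TEST_SIZE = 1000       # validation and test sets are fixed at 1000 samples (reported)

def training_set_size(target_properties):
    """Total training pairs for a given tuple of target properties.

    E.g. choosing {posx, posy, color} yields 3 * 1000 = 3000 samples,
    matching the example quoted above.
    """
    return PAIRS_PER_PROPERTY * len(target_properties)
```

For example, `training_set_size(("posx", "posy", "color"))` returns 3000.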
Hardware Specification: Yes
"We used a single A100 GPU with 40GB of memory."
Software Dependencies: No
The paper mentions libraries such as pygame and kubric but does not pin their version numbers or list other software dependencies with the explicit version details required for reproducibility.
Experiment Setup: Yes
"We used a fixed schedule for the learning rate at 2 * 10^-4, and we used AdamW (Loshchilov & Hutter, 2017) with a weight decay of 0.01 along with ϵ = 10^-8, β1 = 0.9, β2 = 0.999. ... we used a batch size of 64 for 2D shapes ... and a batch size of 128 for 3D shapes. ... we found the combination of w_recons = 100, w_latent = 10 to strike the optimal balance between maintaining good reconstructions and allowing the slot representations to give rise to disentangled projections."
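Collected in one place, the reported hyperparameters form the configuration sketched below. The dict layout and key names are our own illustration; only the values come from the paper, and the optimizer keys map directly onto the `lr`, `weight_decay`, `eps`, and `betas` arguments of `torch.optim.AdamW`.

```python
# Hedged sketch of the reported training setup; structure is illustrative,
# values are those quoted from the paper.
TRAIN_CONFIG = {
    "optimizer": "AdamW",              # Loshchilov & Hutter (2017)
    "lr": 2e-4,                        # fixed learning-rate schedule
    "weight_decay": 0.01,
    "eps": 1e-8,
    "betas": (0.9, 0.999),
    "batch_size": {"2d_shapes": 64, "3d_shapes": 128},
    "loss_weights": {"recons": 100, "latent": 10},
}
```

With a PyTorch model in hand, this would be consumed as, e.g., `torch.optim.AdamW(model.parameters(), lr=TRAIN_CONFIG["lr"], weight_decay=TRAIN_CONFIG["weight_decay"], eps=TRAIN_CONFIG["eps"], betas=TRAIN_CONFIG["betas"])`.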