Object-centric architectures enable efficient causal representation learning
Authors: Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our method on 2D and 3D synthetic image datasets that allowed us to carefully control various aspects of the environment, such as the number of objects, their sizes, shapes, colors, relative positions, and dynamics. Examples of our 2D and 3D datasets are shown in Figures 1 and 3, respectively. ... We present the disentanglement scores for various combinations of properties and environments with k ∈ {2, 3, 4} objects in the scene. |
| Researcher Affiliation | Collaboration | Amin Mansouri (Mila, Quebec AI Institute); Jason Hartford (Valence Labs); Yan Zhang (Samsung SAIT AI Lab, Montreal); Yoshua Bengio (Mila, Quebec AI Institute / U. Montreal / CIFAR) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled as such. |
| Open Source Code | Yes | The code to reproduce our results can be found at: https://github.com/amansouri3476/OC-CRL |
| Open Datasets | No | The paper describes generating its own synthetic datasets: 'We use pygame engine (Shinners, 2011) for generating multi-object 2D scenes.' and 'For generating the 3D datasets we leverage kubric library (Greff et al., 2022) to obtain realistic scenes which we can highly customize.' However, no concrete access information (link, DOI, or repository) is provided for the generated datasets themselves, nor are they stated to be well-known public datasets. |
| Dataset Splits | Yes | For training we generate 1,000 pairs per target property, such that the model on average sees at least 500 samples of either positive or negative perturbations to each property; i.e., if we choose {posx, posy, color} as target properties, we will generate 3,000 samples for training. The validation and test sets always have 1,000 samples each. |
| Hardware Specification | Yes | We used a single A100 GPU with 40GB of memory. |
| Software Dependencies | No | The paper mentions libraries like pygame and kubric but does not specify their version numbers or other software dependencies with explicit version details required for reproducibility. |
| Experiment Setup | Yes | We used a fixed schedule for the learning rate at 2 * 10^-4, and we used AdamW (Loshchilov & Hutter, 2017) with a weight decay of 0.01 along with ϵ = 10^-8, β1 = 0.9, β2 = 0.999. ... we used a batch size of 64 for 2D shapes... and a batch size of 128 for 3D shapes. ... we found the combination of w_recons = 100, w_latent = 10 to strike the optimal balance between maintaining good reconstructions and allowing the slot representations to give rise to disentangled projections. |
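The reported optimizer settings can be reproduced directly in PyTorch. This is a minimal sketch assuming the authors' training loop uses `torch.optim.AdamW`; the `model` here is a placeholder, not the paper's actual slot-based architecture, and the loss-weight combination is shown only as a weighted sum for illustration.

```python
import torch

# Placeholder model standing in for the paper's object-centric encoder.
model = torch.nn.Linear(16, 4)

# Optimizer hyperparameters as reported: fixed lr = 2e-4, AdamW with
# weight decay 0.01, eps = 1e-8, betas = (0.9, 0.999).
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-4,
    weight_decay=0.01,
    eps=1e-8,
    betas=(0.9, 0.999),
)

# Reported loss weighting: w_recons = 100, w_latent = 10. The two loss
# terms below are dummies; the real losses come from the paper's
# reconstruction and latent-matching objectives.
w_recons, w_latent = 100.0, 10.0
recons_loss = torch.tensor(0.5)
latent_loss = torch.tensor(0.1)
total_loss = w_recons * recons_loss + w_latent * latent_loss
```

Batch size would be set in the data loader (64 for the 2D datasets, 128 for the 3D datasets, per the report).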