Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Object-Centric Learning with Slot Attention
Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The goal of this section is to evaluate the Slot Attention module on two object-centric tasks one being supervised and the other one being unsupervised as described in Sections 2.2 and 2.3. We compare against specialized state-of-the-art methods [16, 17, 31] for each respective task. |
| Researcher Affiliation | Collaboration | 1Google Research, Brain Team 2Dept. of Computer Science, ETH Zurich 3Max-Planck Institute for Intelligent Systems |
| Pseudocode | Yes | Algorithm 1 Slot Attention module. The input is a set of N vectors of dimension $D_{\text{inputs}}$ which is mapped to a set of K slots of dimension $D_{\text{slots}}$. We initialize the slots by sampling their initial values as independent samples from a Gaussian distribution with shared, learnable parameters $\mu \in \mathbb{R}^{D_{\text{slots}}}$ and $\sigma \in \mathbb{R}^{D_{\text{slots}}}$. In our experiments we set the number of iterations to T = 3. |
| Open Source Code | Yes | An implementation of Slot Attention is available at: https://github.com/google-research/google-research/tree/master/slot_attention. |
| Open Datasets | Yes | For the object discovery experiments, we use the following three multi-object datasets [83]: CLEVR (with masks), Multi-dSprites, and Tetrominoes. ... [83] Rishabh Kabra, Chris Burgess, Loic Matthey, Raphael Lopez Kaufman, Klaus Greff, Malcolm Reynolds, and Alexander Lerchner. Multi-object datasets. https://github.com/deepmind/multi_object_datasets/, 2019. |
| Dataset Splits | Yes | For set prediction, we use the original CLEVR dataset [84] which contains a training-validation split of 70K and 15K images of rendered objects respectively. |
| Hardware Specification | Yes | On CLEVR6, we can use a batch size of up to 64 on a single V100 GPU with 16GB of RAM as opposed to 4 in [16] using the same type of hardware. ... The Slot Attention model is trained using a single NVIDIA Tesla V100 GPU with 16GB of RAM. |
| Software Dependencies | No | The paper does not specify version numbers for software dependencies such as libraries or frameworks. |
| Experiment Setup | Yes | We train the model using the Adam optimizer [85] with a learning rate of $4 \times 10^{-4}$ and a batch size of 64 (using a single GPU). ... At training time, we use T = 3 iterations of Slot Attention. We use the same training setting across all datasets, apart from the number of slots K: we use K = 7 slots for CLEVR6, K = 6 slots for Multi-dSprites (max. 5 objects per scene), and K = 4 for Tetrominoes (3 objects per scene). |
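
The quoted setup can be collected into a small configuration object, which may help when checking a reproduction against the values above. The following is a minimal sketch, not the authors' code: the names `TrainConfig`, `make_config`, and `sample_initial_slots` are hypothetical, only the hyperparameters quoted in the table are filled in, and the slot initialization assumes the plain form mu + sigma * eps with eps drawn from a standard normal, which is one way to realize the shared Gaussian described in the Pseudocode row.

```python
import random
from dataclasses import dataclass

# Slot counts K per dataset, as quoted in the Experiment Setup row.
NUM_SLOTS = {"CLEVR6": 7, "Multi-dSprites": 6, "Tetrominoes": 4}


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""
    dataset: str
    num_slots: int                # K, the only setting that varies by dataset
    learning_rate: float = 4e-4   # Adam learning rate
    batch_size: int = 64          # fits on a single GPU per the table
    num_iterations: int = 3       # T, Slot Attention iterations at training time


def make_config(dataset: str) -> TrainConfig:
    """Build the config for one of the three object discovery datasets."""
    return TrainConfig(dataset=dataset, num_slots=NUM_SLOTS[dataset])


def sample_initial_slots(rng, mu, sigma, num_slots):
    """Draw num_slots independent slot vectors from N(mu, diag(sigma^2)).

    mu and sigma are shared across slots (learnable in the paper; here they
    are plain lists of floats). The form mu + sigma * eps is an assumption.
    """
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(num_slots)
    ]
```

For example, `make_config("Tetrominoes")` yields a config with `num_slots=4`, and `sample_initial_slots(random.Random(0), mu, sigma, 4)` returns four slot vectors whose length matches `mu`. The slot dimension is not stated in the table, so any value passed here is a placeholder.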