Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Object-Centric Learning with Slot Attention

Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental The goal of this section is to evaluate the Slot Attention module on two object-centric tasks, one being supervised and the other one being unsupervised, as described in Sections 2.2 and 2.3. We compare against specialized state-of-the-art methods [16, 17, 31] for each respective task.
Researcher Affiliation Collaboration ¹Google Research, Brain Team; ²Dept. of Computer Science, ETH Zurich; ³Max-Planck Institute for Intelligent Systems
Pseudocode Yes Algorithm 1 Slot Attention module. The input is a set of N vectors of dimension D_inputs which is mapped to a set of K slots of dimension D_slots. We initialize the slots by sampling their initial values as independent samples from a Gaussian distribution with shared, learnable parameters µ ∈ R^D_slots and σ ∈ R^D_slots. In our experiments we set the number of iterations to T = 3.
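The iteration described in Algorithm 1 can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the projection matrices are random stand-ins for learned parameters, and the GRU update, layer norms, and residual MLP used in the paper are omitted for brevity.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def slot_attention(inputs, num_slots=4, dim=32, iters=3, seed=0):
    """Minimal sketch of the Slot Attention iteration.

    inputs: (N, dim) array of input features.
    Returns: (num_slots, dim) array of slot vectors.
    Assumptions: random (not learned) projections; the GRU/MLP slot
    update from the paper is replaced by a plain assignment.
    """
    rng = np.random.default_rng(seed)
    W_q = rng.normal(size=(dim, dim)) / np.sqrt(dim)
    W_k = rng.normal(size=(dim, dim)) / np.sqrt(dim)
    W_v = rng.normal(size=(dim, dim)) / np.sqrt(dim)
    # Shared, learnable Gaussian parameters in the paper; fixed here.
    mu, log_sigma = np.zeros(dim), np.zeros(dim)

    slots = mu + np.exp(log_sigma) * rng.normal(size=(num_slots, dim))
    k, v = inputs @ W_k, inputs @ W_v
    for _ in range(iters):
        q = slots @ W_q
        # Softmax over the slot axis: slots compete to explain each input.
        attn = softmax(k @ q.T / np.sqrt(dim), axis=1)   # (N, num_slots)
        # Weighted-mean aggregation of values per slot.
        attn = attn / attn.sum(axis=0, keepdims=True)
        slots = attn.T @ v                               # (num_slots, dim)
    return slots
```

The key design point reproduced here is the softmax normalization over slots (rather than over inputs, as in standard attention), which induces competition between slots for the input elements.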
Open Source Code Yes An implementation of Slot Attention is available at: https://github.com/google-research/google-research/tree/master/slot_attention
Open Datasets Yes For the object discovery experiments, we use the following three multi-object datasets [83]: CLEVR (with masks), Multi-dSprites, and Tetrominoes. ... [83] Rishabh Kabra, Chris Burgess, Loic Matthey, Raphael Lopez Kaufman, Klaus Greff, Malcolm Reynolds, and Alexander Lerchner. Multi-object datasets. https://github.com/deepmind/multi_object_datasets/, 2019.
Dataset Splits Yes For set prediction, we use the original CLEVR dataset [84] which contains a training-validation split of 70K and 15K images of rendered objects respectively.
Hardware Specification Yes On CLEVR6, we can use a batch size of up to 64 on a single V100 GPU with 16GB of RAM as opposed to 4 in [16] using the same type of hardware. ... The Slot Attention model is trained using a single NVIDIA Tesla V100 GPU with 16GB of RAM.
Software Dependencies No The paper does not specify version numbers for software dependencies such as libraries or frameworks.
Experiment Setup Yes We train the model using the Adam optimizer [85] with a learning rate of 4 × 10⁻⁴ and a batch size of 64 (using a single GPU). ... At training time, we use T = 3 iterations of Slot Attention. We use the same training setting across all datasets, apart from the number of slots K: we use K = 7 slots for CLEVR6, K = 6 slots for Multi-dSprites (max. 5 objects per scene), and K = 4 for Tetrominoes (3 objects per scene).
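The reported hyperparameters can be collected into a single configuration for reference. The values below are the ones quoted in the experiment-setup excerpt; the dictionary layout and key names are an illustrative convention, not from the paper.

```python
# Hyperparameters as reported in the paper's experiment setup.
# Dict structure and key names are illustrative assumptions.
SLOT_ATTENTION_CONFIG = {
    "optimizer": "adam",          # Adam [85]
    "learning_rate": 4e-4,
    "batch_size": 64,             # single GPU
    "train_iters_T": 3,           # Slot Attention iterations at training time
    "num_slots_K": {              # only K varies across datasets
        "clevr6": 7,
        "multi_dsprites": 6,      # max. 5 objects per scene
        "tetrominoes": 4,         # 3 objects per scene
    },
}
```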