Identifiable Object-Centric Representation Learning via Probabilistic Slot Attention

Authors: Avinash Kori, Francesco Locatello, Ainkaran Santhirasekaram, Francesca Toni, Ben Glocker, Fabio De Sousa Ribeiro

NeurIPS 2024

Reproducibility assessment (variable, result, and LLM response):

Research Type: Experimental
LLM response: We provide empirical verification of our theoretical identifiability result using both simple 2-dimensional data and high-resolution imaging datasets.

Researcher Affiliation: Academia
LLM response: Avinash Kori (1), Francesco Locatello (2), Ainkaran Santhirasekaram (1), Francesca Toni (1), Ben Glocker (1), Fabio De Sousa Ribeiro (1). (1) Imperial College London, UK; (2) Institute of Science and Technology Austria.

Pseudocode: Yes
LLM response: Algorithm 1, Probabilistic Slot Attention.

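The paper's Algorithm 1 is not reproduced in this summary. For orientation, below is a minimal PyTorch-style sketch of a probabilistic (Gaussian-mixture) slot attention update; the module name, projections, initialisation, and the exact ordering of the responsibility and update steps are assumptions made for illustration and may differ from the authors' algorithm.

```python
import torch
import torch.nn as nn


class ProbabilisticSlotAttention(nn.Module):
    """Illustrative GMM-style slot attention step (not the authors' code)."""

    def __init__(self, num_slots: int, dim: int, iters: int = 3, eps: float = 1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.norm_in = nn.LayerNorm(dim)
        self.to_k = nn.Linear(dim, dim)  # key projection of input features
        self.to_v = nn.Linear(dim, dim)  # value projection of input features

    def forward(self, x):  # x: (B, N, D) encoder features
        B, N, D = x.shape
        x = self.norm_in(x)
        k, v = self.to_k(x), self.to_v(x)

        # Initialise per-slot Gaussian parameters and mixing weights.
        mu = torch.randn(B, self.num_slots, D, device=x.device)
        sigma2 = torch.ones(B, self.num_slots, D, device=x.device)
        pi = torch.full((B, self.num_slots, 1), 1.0 / self.num_slots, device=x.device)

        for _ in range(self.iters):
            # E-step: responsibilities (attention) of each slot for each token,
            # proportional to pi_k * N(k_n; mu_k, diag(sigma2_k)), normalised over slots.
            diff = k.unsqueeze(1) - mu.unsqueeze(2)                          # (B, K, N, D)
            log_lik = -0.5 * ((diff ** 2) / sigma2.unsqueeze(2)
                              + sigma2.unsqueeze(2).log()).sum(-1)           # (B, K, N)
            attn = torch.softmax(torch.log(pi + self.eps) + log_lik, dim=1)  # (B, K, N)

            # M-step: responsibility-weighted means, variances, and mixing weights.
            w = attn / (attn.sum(dim=2, keepdim=True) + self.eps)
            mu = torch.einsum('bkn,bnd->bkd', w, v)
            sigma2 = torch.einsum('bkn,bknd->bkd', w,
                                  (v.unsqueeze(1) - mu.unsqueeze(2)) ** 2) + self.eps
            pi = attn.mean(dim=2, keepdim=True)

        # Sample slot representations from the fitted per-slot Gaussians.
        slots = mu + sigma2.sqrt() * torch.randn_like(mu)
        return slots, attn
```

In a full model, the resulting slots would be passed to a decoder (see the Experiment Setup entry below) and trained with a reconstruction objective.
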
Open Source Code: Yes
LLM response: We provide a codebase with the described set of hyper-parameters for reproducibility.

Open Datasets: Yes
LLM response: Our experimental analysis involves standard benchmark datasets from the object-centric learning literature, including SPRITEWORLD [6], CLEVR [34], and OBJECTSROOM [35].

Dataset Splits: No
LLM response: We used 1000 data points in total for training our PSA model.

Hardware Specification: No
LLM response: E.g., for a CNN with 500K parameters and batch size 32, 125GB of GPU memory is needed.

Software Dependencies: No
LLM response: The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x).

Experiment Setup: Yes
LLM response: We experiment with two types of decoders: (i) an additive decoder similar to [79]'s spatial broadcasting model; and (ii) a standard convolutional decoder. In all cases, we use LeakyReLU activations to satisfy the weak injectivity conditions (Assumption 8)... In this case, we found that a lower maximum learning rate of 10^-4 was beneficial for stabilizing PSA training.

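To make the decoder setup concrete, here is a hedged PyTorch sketch of an additive (spatial-broadcast) decoder using LeakyReLU activations; the layer widths, output resolution, and the RGB-plus-alpha compositing scheme are illustrative assumptions, since the entry above only specifies the decoder type and activation.

```python
import torch
import torch.nn as nn


class SpatialBroadcastDecoder(nn.Module):
    """Illustrative additive (spatial-broadcast) slot decoder with LeakyReLU."""

    def __init__(self, slot_dim: int = 64, resolution: int = 64):
        super().__init__()
        self.resolution = resolution
        self.net = nn.Sequential(
            nn.Conv2d(slot_dim + 2, 64, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(64, 4, 3, padding=1),  # 3 RGB channels + 1 alpha logit per slot
        )

    def forward(self, slots):  # slots: (B, K, slot_dim)
        B, K, D = slots.shape
        H = W = self.resolution

        # Broadcast each slot over an H x W grid and append (x, y) coordinates.
        grid = torch.stack(torch.meshgrid(
            torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing='ij'))
        grid = grid.to(slots.device).unsqueeze(0).expand(B * K, -1, -1, -1)
        tiles = slots.reshape(B * K, D, 1, 1).expand(-1, -1, H, W)

        out = self.net(torch.cat([tiles, grid], dim=1)).view(B, K, 4, H, W)
        rgb, alpha = out[:, :, :3], out[:, :, 3:]

        # Additive composition: alpha-weighted sum of per-slot reconstructions.
        masks = torch.softmax(alpha, dim=1)
        return (masks * rgb).sum(dim=1), masks
```

Per the entry above, training such a decoder together with PSA would use a lower maximum learning rate of 10^-4 for stability.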