Identifiable Object-Centric Representation Learning via Probabilistic Slot Attention
Authors: Avinash Kori, Francesco Locatello, Ainkaran Santhirasekaram, Francesca Toni, Ben Glocker, Fabio De Sousa Ribeiro
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical verification of our theoretical identifiability result using both simple 2-dimensional data and high-resolution imaging datasets. |
| Researcher Affiliation | Academia | Avinash Kori (1), Francesco Locatello (2), Ainkaran Santhirasekaram (1), Francesca Toni (1), Ben Glocker (1), Fabio De Sousa Ribeiro (1); (1) Imperial College London, UK; (2) Institute of Science and Technology Austria |
| Pseudocode | Yes | Algorithm 1: Probabilistic Slot Attention (a hedged sketch of the EM-style update follows the table) |
| Open Source Code | Yes | We provide a codebase with the described set of hyper-parameters for reproducibility. |
| Open Datasets | Yes | Our experimental analysis involves standard benchmark datasets from object-centric learning literature including SPRITEWORLD [6], CLEVR [34], and OBJECTSROOM [35]. |
| Dataset Splits | No | We used 1000 data points in total for training our PSA model. |
| Hardware Specification | No | E.g., for a CNN with 500K parameters and batch size 32, 125GB of GPU memory is needed. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | We experiment with two types of decoders: (i) an additive decoder similar to [79]'s spatial broadcasting model; and (ii) a standard convolutional decoder. In all cases, we use LeakyReLU activations to satisfy the weak injectivity conditions (Assumption 8)... In this case, we found that a lower maximum learning rate of 10^-4 was beneficial for stabilizing PSA training. (A hedged decoder sketch follows below.) |
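
The Pseudocode row above points to Algorithm 1 (Probabilistic Slot Attention). As a rough illustration of the EM-style Gaussian-mixture update such a procedure is built around, the following minimal PyTorch sketch softly assigns input tokens to Gaussian slot components and re-estimates their means, variances, and mixing weights. The function name, default hyper-parameters, and tensor shapes are illustrative assumptions; the learned projections and other refinements of the authors' Algorithm 1 are omitted.

```python
import math
import torch

def probabilistic_slot_attention(x, num_slots=4, iters=3, eps=1e-6):
    """EM-style GMM sketch: tokens x of shape (B, N, D) are softly assigned to
    num_slots Gaussian components (the slots); returns slot means, diagonal
    variances, mixing weights, and the final responsibilities (attention maps).
    Illustrative only, not the authors' implementation."""
    B, N, D = x.shape
    mu = torch.randn(B, num_slots, D)                  # slot means
    sigma2 = torch.ones(B, num_slots, D)               # diagonal slot variances
    pi = torch.full((B, num_slots), 1.0 / num_slots)   # mixing weights

    for _ in range(iters):
        # E-step: responsibilities r[b, n, k] proportional to pi_k * N(x_n; mu_k, sigma2_k)
        diff = x.unsqueeze(2) - mu.unsqueeze(1)                          # (B, N, K, D)
        log_p = -0.5 * ((diff ** 2) / sigma2.unsqueeze(1)
                        + torch.log(2 * math.pi * sigma2.unsqueeze(1))).sum(-1)
        r = torch.softmax(torch.log(pi + eps).unsqueeze(1) + log_p, dim=-1)

        # M-step: re-estimate mixing weights, means, and variances per slot
        Nk = r.sum(dim=1) + eps                                          # (B, K)
        pi = Nk / N
        mu = torch.einsum('bnk,bnd->bkd', r, x) / Nk.unsqueeze(-1)
        diff = x.unsqueeze(2) - mu.unsqueeze(1)
        sigma2 = torch.einsum('bnk,bnkd->bkd', r, diff ** 2) / Nk.unsqueeze(-1) + eps

    return mu, sigma2, pi, r

# Illustrative usage: 8 "images", 64 feature tokens each, 32-dim features.
# slots, var, weights, attn = probabilistic_slot_attention(torch.randn(8, 64, 32), num_slots=5)
```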
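
The Experiment Setup row mentions an additive decoder similar to the spatial-broadcasting model of [79], using LeakyReLU activations. A hedged sketch of such a decoder is shown below, assuming each slot is broadcast over a coordinate-augmented spatial grid, decoded by a small LeakyReLU CNN, and composited with softmax alpha masks; the class name, channel widths, and output handling are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SpatialBroadcastDecoder(nn.Module):
    """Hypothetical additive decoder sketch: each slot is broadcast over a spatial
    grid, concatenated with coordinate channels, decoded by a small LeakyReLU CNN,
    and the per-slot RGB outputs are combined with softmax alpha masks."""
    def __init__(self, slot_dim=32, out_size=64, hidden=64):
        super().__init__()
        self.out_size = out_size
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, out_size),
                                torch.linspace(-1, 1, out_size), indexing="ij")
        self.register_buffer("grid", torch.stack([ys, xs], dim=0))   # (2, H, W)
        self.net = nn.Sequential(
            nn.Conv2d(slot_dim + 2, hidden, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(hidden, 4, 3, padding=1),   # 3 RGB channels + 1 alpha logit
        )

    def forward(self, slots):                      # slots: (B, K, slot_dim)
        B, K, D = slots.shape
        H = W = self.out_size
        z = slots.reshape(B * K, D, 1, 1).expand(B * K, D, H, W)
        grid = self.grid.unsqueeze(0).expand(B * K, 2, H, W)
        out = self.net(torch.cat([z, grid], dim=1)).view(B, K, 4, H, W)
        rgb, alpha = out[:, :, :3], out[:, :, 3:].softmax(dim=1)     # mix over slots
        return (rgb * alpha).sum(dim=1)            # additive composition: (B, 3, H, W)
```

A standard convolutional decoder (the paper's second variant) would instead decode all slots jointly; the sketch above only illustrates the additive, per-slot composition mentioned in the quoted setup.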