Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Test-time Adaptation with Slot-Centric Models
Authors: Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd Van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Slot-TTA across multiple input modalities, images or 3D point clouds, and show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods. 4. Experiments |
| Researcher Affiliation | Collaboration | ¹Carnegie Mellon University, ²Mila, DeepMind, ³Google Research. |
| Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project Webpage: http://slot-tta.github.io/ Our code is publicly available to the community on our project webpage: http://slot-tta.github.io. |
| Open Datasets | Yes | We test Slot-TTA in scene segmentation of multi-view posed images, single-view images and 3D point clouds in the datasets of PartNet (Mo et al., 2019), MultiShapeNet-Hard (Sajjadi et al., 2022b) and CLEVR (Johnson et al., 2017). |
| Dataset Splits | No | The paper describes train and test splits, but a distinct 'validation' split for hyperparameter tuning or early stopping is not explicitly mentioned. |
| Hardware Specification | Yes | Test-time adaptation for each example takes about 10 seconds on a single TPUv2 chip. We use a single V100 GPU for training and inference. |
| Software Dependencies | No | The paper mentions optimizers like Adam but does not specify any software libraries, frameworks, or programming languages with version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | We use a batch size of 256 in this setting. We set our learning rate as 10⁻⁴. We use an Adam optimizer with β₁ = 0.9, β₂ = 0.999. For training, our model takes about 4 days to converge using 64 TPUv2 chips. |
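
The optimizer settings quoted in the Experiment Setup row can be collected into a small configuration, shown here as a minimal sketch rather than the authors' code. The names `AdamConfig` and `adam_step` are hypothetical, the learning rate of `1e-4` is an assumed reading of the quoted value, and `eps` is an assumed default that the quoted text does not state. The update shown is the standard Adam rule applied to a single scalar parameter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AdamConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    batch_size: int = 256
    learning_rate: float = 1e-4  # assumption: reading of the quoted learning rate
    beta1: float = 0.9
    beta2: float = 0.999
    eps: float = 1e-8  # assumption: not stated in the quoted text


def adam_step(param, grad, m, v, t, cfg):
    """Apply one standard Adam update to a scalar parameter.

    t is the 1-based step count used for bias correction.
    Returns the updated parameter and both moment estimates.
    """
    m = cfg.beta1 * m + (1.0 - cfg.beta1) * grad
    v = cfg.beta2 * v + (1.0 - cfg.beta2) * grad * grad
    m_hat = m / (1.0 - cfg.beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - cfg.beta2 ** t)  # bias-corrected second moment
    param = param - cfg.learning_rate * m_hat / (math.sqrt(v_hat) + cfg.eps)
    return param, m, v


cfg = AdamConfig()
# First step from zero-initialized moments with an illustrative gradient of 0.5.
param, m, v = adam_step(1.0, 0.5, 0.0, 0.0, 1, cfg)
```

On the first step the bias-corrected moments reduce to the gradient and its square, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.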