Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Self-supervised Object-Centric Learning for Videos
Authors: Görkay Aydemir, Weidi Xie, Fatma Güney
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments 4.1 Experimental Setup Datasets: Our proposed method is evaluated on one synthetic and two real-world video datasets. For the synthetic dataset, we select MOVi [25], a widely-used benchmark for evaluating object-centric methods, particularly for multi-object segmentation in videos. ... Metrics: For our synthetic dataset evaluation, we use the foreground adjusted rand index (FG-ARI) to measure the quality of clustering into multiple foreground objects. |
| Researcher Affiliation | Academia | Görkay Aydemir (1), Weidi Xie (3, 4), Fatma Güney (1, 2). (1) Department of Computer Engineering, Koç University; (2) KUIS AI Center; (3) CMIC, Shanghai Jiao Tong University; (4) Shanghai AI Laboratory |
| Pseudocode | No | The paper describes its methodology and architecture in text and figures, but it does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project page: https://kuis-ai.github.io/solv |
| Open Datasets | Yes | Datasets: Our proposed method is evaluated on one synthetic and two real-world video datasets. For the synthetic dataset, we select MOVi [25]... Additionally, we evaluate our method on a subset of the Youtube-VIS 2019 (YTVIS19) [87] train set... For real-world datasets, we use the validation split of DAVIS17 [65]. |
| Dataset Splits | Yes | For real-world datasets, we use the validation split of DAVIS17 [65]. In addition, we evaluate our method on a subset of the Youtube-VIS 2019 (YTVIS19) [87] train set, because there is no official validation or test set provided with ground-truth masks. |
| Hardware Specification | Yes | We train our models on 2 V100 GPUs using the Adam [39] optimizer with a batch size of 48. |
| Software Dependencies | No | The paper mentions using 'the sklearn library [64]' for Agglomerative Clustering but does not provide a specific version number for scikit-learn or any other software dependencies with their versions. |
| Experiment Setup | Yes | We set the number of consecutive frame range n to 2 and drop half of the tokens before the slot attention step. We train our models on 2 V100 GPUs using the Adam [39] optimizer with a batch size of 48. We clip the gradient norms at 1 to stabilize the training. ... MOVi-E: We train our model from scratch for a total of 60 epochs... We use a maximum learning rate of 4 × 10⁻⁴ and an exponential decay schedule... The model is trained using 18 slots and the input frames are adjusted to a size of 336 × 336... The slot merge coefficient in ψmerge is configured to 0.12. |
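
The Research Type row quotes the paper's use of the foreground adjusted rand index (FG-ARI) on the synthetic dataset. For readers unfamiliar with the metric, the following is a minimal sketch of how such a score is commonly computed: the adjusted Rand index restricted to pixels that the ground truth marks as foreground. It is not the authors' evaluation code. The function names, the flat-list mask layout, the `background_id=0` convention, and the choice to return 1.0 for degenerate inputs are all assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat label sequences."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two points counts as a trivial match

    # Pairs that share a cluster in both labelings, in truth only, in pred only.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs
    max_index = (pairs_true + pairs_pred) / 2
    if max_index == expected:
        return 1.0  # assumption: both labelings are trivial and identical in structure
    return (pairs_both - expected) / (max_index - expected)


def fg_ari(gt_labels, pred_labels, background_id=0):
    """ARI over foreground pixels only (ground-truth label != background_id)."""
    foreground = [(g, p) for g, p in zip(gt_labels, pred_labels) if g != background_id]
    gt_fg = [g for g, _ in foreground]
    pred_fg = [p for _, p in foreground]
    return adjusted_rand_index(gt_fg, pred_fg)


if __name__ == "__main__":
    # Hypothetical flattened masks: 0 is background, 1 and 2 are objects.
    gt = [0, 0, 1, 1, 2, 2]
    pred = [5, 7, 3, 3, 4, 4]  # background pixels are split, but that is ignored
    print(fg_ari(gt, pred))
```

Because background pixels are dropped before scoring, a prediction that fragments the background is not penalized; only the grouping of foreground pixels matters.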