Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Test-time Training for Matching-based Video Object Segmentation
Authors: Juliette Bertrand, Giorgos Kordopatis-Zilos, Yannis Kalantidis, Giorgos Tolias
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on common benchmarks demonstrate that the proposed test-time training yields significant improvements in performance. Our results illustrate that test-time training enhances performance even in these challenging cases. |
| Researcher Affiliation | Collaboration | Juliette Bertrand¹,², Giorgos Kordopatis-Zilos¹, Yannis Kalantidis², Giorgos Tolias¹; ¹VRG, FEE, Czech Technical University in Prague; ²NAVER LABS Europe |
| Pseudocode | No | The paper does not include a dedicated section or figure explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Project page: https://jbertrand89.github.io/test-time-training-vos/ |
| Open Datasets | Yes | DAVIS-2017 validation set [38] and the YouTubeVOS-2018 validation set [51]. We further report results on the validation set of the recent MOSE [10] dataset... Additionally, we introduce DAVIS-C... ImageNet-C [14]... BL-30K [8] |
| Dataset Splits | Yes | We report results on the two most commonly used benchmarks for video object segmentation, the DAVIS-2017 validation set [38] and the YouTubeVOS-2018 validation set [51]. The validation split of the DAVIS-2017 [38] dataset contains 30 videos... The validation split of the YouTubeVOS-2018 [51] dataset contains 474 high-quality videos... |
| Hardware Specification | Yes | which typically requires approximately 12.5 hours on 2 A100 GPUs to train STCN. |
| Software Dependencies | No | The paper mentions the use of 'Adam [23] optimizer' and builds on the 'STCN and XMem implementations', but does not provide specific version numbers for software libraries or dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We use learning rates $10^{-5}$ and $10^{-6}$ for models STCN-BL30K/XMem-BL30K and STCN-DY/XMem-DY, respectively, since their training data differ significantly. Jump step s for sampling training frames is set to 10. For each test example, we train the models with tt-MCC and tt-Ent for 100 iterations and with tt-AE for 20, using the Adam [23] optimizer and a batch size of 4 sequences for STCN and 1 for XMem. |
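
The hyperparameters quoted in the Experiment Setup row can be collected into a small lookup, which may help when trying to reproduce the reported settings. The following is a minimal sketch, not the authors' code: the names `TTTConfig`, `make_config`, and the dictionary constants are hypothetical, and only the numeric values come from the quoted text.

```python
from dataclasses import dataclass

# Values are taken from the Experiment Setup row above.
# All identifiers here are hypothetical and do not come from the paper's code.
LEARNING_RATES = {
    "STCN-BL30K": 1e-5,
    "XMem-BL30K": 1e-5,
    "STCN-DY": 1e-6,
    "XMem-DY": 1e-6,
}
ITERATIONS = {"tt-MCC": 100, "tt-Ent": 100, "tt-AE": 20}
BATCH_SIZES = {"STCN": 4, "XMem": 1}  # sequences per batch
JUMP_STEP = 10  # jump step s for sampling training frames


@dataclass(frozen=True)
class TTTConfig:
    """Test-time training settings for one (model, loss) pair."""
    model: str
    loss: str
    learning_rate: float
    iterations: int
    batch_size: int
    jump_step: int = JUMP_STEP
    optimizer: str = "Adam"


def make_config(model: str, loss: str) -> TTTConfig:
    """Build the settings for a model variant and a test-time loss."""
    backbone = model.split("-")[0]  # "STCN-BL30K" -> "STCN"
    return TTTConfig(
        model=model,
        loss=loss,
        learning_rate=LEARNING_RATES[model],
        iterations=ITERATIONS[loss],
        batch_size=BATCH_SIZES[backbone],
    )


if __name__ == "__main__":
    print(make_config("XMem-DY", "tt-AE"))
```

An unknown model or loss name raises `KeyError`, which makes combinations not covered by the quoted setup fail loudly instead of silently falling back to a default.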