Test-time Training for Matching-based Video Object Segmentation
Authors: Juliette Bertrand, Giorgos Kordopatis-Zilos, Yannis Kalantidis, Giorgos Tolias
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on common benchmarks demonstrate that the proposed test-time training yields significant improvements in performance. Our results illustrate that test-time training enhances performance even in these challenging cases. |
| Researcher Affiliation | Collaboration | Juliette Bertrand 1,2 Giorgos Kordopatis-Zilos 1 Yannis Kalantidis2 Giorgos Tolias1 1VRG, FEE, Czech Technical University in Prague 2NAVER LABS Europe |
| Pseudocode | No | The paper does not include a dedicated section or figure explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Project page: https://jbertrand89.github.io/test-time-training-vos/ |
| Open Datasets | Yes | DAVIS-2017 validation set [38] and the YouTube-VOS 2018 validation set [51]. We further report results on the validation set of the recent MOSE [10] dataset... Additionally, we introduce DAVIS-C... ImageNet-C [14]... BL-30K [8] |
| Dataset Splits | Yes | We report results on the two most commonly used benchmarks for video object segmentation, the DAVIS-2017 validation set [38] and the YouTube-VOS 2018 validation set [51]. The validation split of the DAVIS-2017 [38] dataset contains 30 videos... The validation split of the YouTube-VOS 2018 [51] dataset contains 474 high-quality videos... |
| Hardware Specification | Yes | which typically requires approximately 12.5 hours on 2 A100 GPUs to train STCN. |
| Software Dependencies | No | The paper mentions the use of the 'Adam [23] optimizer' and builds on the 'STCN and XMem implementations', but does not provide specific version numbers for software libraries or dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We use learning rates 10^-5 and 10^-6 for models STCN-BL30K / XMem-BL30K and STCN-DY / XMem-DY, respectively, since their training data differ significantly. Jump step s for sampling training frames is set to 10. For each test example, we train the models with tt-MCC and tt-Ent for 100 iterations and with tt-AE for 20, using the Adam [23] optimizer and a batch size of 4 sequences for STCN and 1 for XMem. |
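
The Experiment Setup row amounts to a short per-video optimization loop. Below is a minimal, hedged sketch of such a loop in PyTorch, using only the hyperparameters quoted from the paper (Adam, learning rates, iteration counts, jump step, batch size). The placeholder model, the entropy-style loss, and the `sample_clip` helper are illustrative assumptions, not the authors' STCN/XMem code or their actual tt-MCC / tt-Ent / tt-AE objectives.

```python
# Minimal sketch of a per-video test-time training loop matching the
# "Experiment Setup" row. The model, loss, and frame sampler below are
# placeholders; only the hyperparameters (Adam, lr, iteration counts,
# jump step, batch size) come from the paper.
import torch
import torch.nn as nn

# Placeholder stand-in for a matching-based VOS model such as STCN or XMem.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.Conv2d(16, 2, 1))

def entropy_loss(logits):
    # Illustrative unsupervised objective (prediction entropy); the paper's
    # actual objectives are tt-MCC, tt-Ent, and tt-AE.
    probs = logits.softmax(dim=1)
    return -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()

def sample_clip(video, jump_step=10, clip_len=3):
    # Hypothetical sampler: pick frames spaced by the jump step s = 10.
    idx = (torch.arange(clip_len) * jump_step).clamp(max=video.shape[0] - 1)
    return video[idx]

def test_time_train(video, lr=1e-5, iterations=100, batch_size=4):
    # lr = 1e-5 for the BL30K models, 1e-6 for the DY models;
    # 100 iterations for tt-MCC / tt-Ent, 20 for tt-AE;
    # batch size of 4 sequences for STCN, 1 for XMem.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iterations):
        clips = torch.stack([sample_clip(video) for _ in range(batch_size)])
        frames = clips.flatten(0, 1)  # merge (batch, time) into one axis
        loss = entropy_loss(model(frames))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Example: one synthetic "video" of 30 RGB frames at 64x64 resolution.
test_time_train(torch.rand(30, 3, 64, 64))
```

In the paper's setting, each test video would adapt its own copy of the pretrained network before segmentation, so the loop above would be rerun from the pretrained weights for every test example.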