Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation
Authors: Yuxi Li, Ning Xu, Jinlong Peng, John See, Weiyao Lin
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on challenging benchmarks of DAVIS17 and Youtube-VOS, demonstrating that the cyclic mechanism is beneficial to segmentation quality. |
| Researcher Affiliation | Collaboration | Yuxi Li, Shanghai Jiao Tong University, Shanghai, China (lyxok1@sjtu.edu.cn); Ning Xu, Adobe Research, San Jose, CA (nxu@adobe.com); Jinlong Peng, Tencent Youtu Lab, Shanghai, China (jeromepeng@tencent.com); John See, Multimedia University, Selangor, Malaysia (johnsee@mmu.edu.my); Weiyao Lin, Shanghai Jiao Tong University, Shanghai, China (wylin@sjtu.edu.cn) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the source code for their methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | Datasets. We train and evaluate our method on two widely used benchmarks for semi-supervised video object segmentation, DAVIS17 [10] and Youtube-VOS [11]. |
| Dataset Splits | Yes | DAVIS17 contains 120 video sequences in total with at most 10 objects in a video. The dataset is split into 60 sequences for training, 30 for validation and the other 30 for test. The Youtube-VOS is larger in scale and contains more object categories. There are a total of 3,471 video sequences for training and 474 videos for validation in this dataset with at most 12 objects in a video. |
| Hardware Specification | Yes | The training and inference procedures are deployed on an NVIDIA TITAN Xp GPU. |
| Software Dependencies | No | The paper mentions software components such as a ResNet-50 backbone, ImageNet pretraining, and the Adam optimizer, but does not provide specific version numbers for these or for any libraries/frameworks. |
| Experiment Setup | Yes | We set the hyperparameters as γ = 1.0, N = 10, K = 5, and M = 50. The network is trained with a batch size of 4 for 240 epochs in total and is optimized by the Adam optimizer [22] with learning rate 10^-5 and β1 = 0.9, β2 = 0.999. In both training and inference stages, the input frames are resized to a resolution of 240 × 427. |
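
The reported setup maps directly onto a standard training configuration. The sketch below is a minimal reproduction of the optimization settings only, assuming a PyTorch implementation (the paper does not name its framework); the `model`, dummy batch, and loss are hypothetical stand-ins for the authors' network and cyclic training objective, and the cycle-specific hyperparameters (γ, N, K, M) are recorded but not wired into a real cyclic loss here.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T

# Hyperparameters as reported in the paper.
GAMMA = 1.0            # cycle-loss weight (gamma); usage here is illustrative only
N, K, M = 10, 5, 50    # cyclic-mechanism hyperparameters from the paper
BATCH_SIZE = 4
EPOCHS = 240
INPUT_SIZE = (240, 427)  # frames resized to 240 x 427 for both training and inference

# Hypothetical stand-in for the VOS network; the paper uses a ResNet-50-based model.
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Adam with the reported learning rate and betas.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, betas=(0.9, 0.999))

# Preprocessing matching the reported input resolution (for PIL-image inputs).
preprocess = T.Compose([T.Resize(INPUT_SIZE), T.ToTensor()])

# Dummy batch standing in for a DAVIS17 / Youtube-VOS mini-batch of frames and masks.
frames = torch.randn(BATCH_SIZE, 3, *INPUT_SIZE)
masks = torch.randint(0, 2, (BATCH_SIZE, 1, *INPUT_SIZE)).float()

# One illustrative optimization step (a real run would loop over EPOCHS epochs).
optimizer.zero_grad()
loss = nn.functional.binary_cross_entropy_with_logits(model(frames), masks)
loss.backward()
optimizer.step()
```

Since the paper reports a single NVIDIA TITAN Xp GPU, the small batch size of 4 at 240 × 427 resolution is consistent with that memory budget; anyone attempting reproduction on different hardware should keep these values fixed rather than scaling them.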