Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation

Authors: Yuxi Li, Ning Xu, Jinlong Peng, John See, Weiyao Lin

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct comprehensive experiments on challenging benchmarks of DAVIS17 and Youtube-VOS, demonstrating that the cyclic mechanism is beneficial to segmentation quality."
Researcher Affiliation | Collaboration | Yuxi Li (Shanghai Jiao Tong University, Shanghai, China; lyxok1@sjtu.edu.cn); Ning Xu (Adobe Research, San Jose, CA; nxu@adobe.com); Jinlong Peng (Tencent Youtu Lab, Shanghai, China; jeromepeng@tencent.com); John See (Multimedia University, Selangor, Malaysia; johnsee@mmu.edu.my); Weiyao Lin (Shanghai Jiao Tong University, Shanghai, China; wylin@sjtu.edu.cn)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states unambiguously that the authors release the source code for their method nor provides a direct link to a code repository.
Open Datasets | Yes | "Datasets. We train and evaluate our method on two widely used benchmarks for semi-supervised video object segmentation, DAVIS17 [10] and Youtube-VOS [11]."
Dataset Splits | Yes | "DAVIS17 contains 120 video sequences in total with at most 10 objects in a video. The dataset is split into 60 sequences for training, 30 for validation and the other 30 for test. The Youtube-VOS is larger in scale and contains more object categories. There are a total of 3,471 video sequences for training and 474 videos for validation in this dataset with at most 12 objects in a video." (See the split summary after this table.)
Hardware Specification | Yes | "The training and inference procedures are deployed on an NVIDIA TITAN Xp GPU."
Software Dependencies | No | The paper mentions software components such as a ResNet-50 backbone, ImageNet pretraining, and the Adam optimizer, but does not provide version numbers for these or for any other libraries/frameworks. (See the backbone sketch after this table.)
Experiment Setup | Yes | "We set the hyperparameters as γ = 1.0, N = 10, K = 5, and M = 50. The network is trained with a batch size of 4 for 240 epochs in total and is optimized by the Adam optimizer [22] of learning rate 10^-5 and β1 = 0.9, β2 = 0.999. In both training and inference stages, the input frames are resized to the resolution of 240 × 427." (See the training-setup sketch below.)
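
For quick reference, the reported split sizes can be restated in a small Python snippet. This is a convenience summary of the numbers quoted in the Dataset Splits row, not part of the paper's code:

```python
# Dataset split sizes as reported in the paper (DAVIS17: 60/30/30 of 120
# sequences; Youtube-VOS: 3,471 training and 474 validation sequences).
SPLITS = {
    "DAVIS17": {"train": 60, "val": 30, "test": 30},
    "Youtube-VOS": {"train": 3471, "val": 474},
}

# Sanity check: the DAVIS17 splits account for all 120 sequences.
assert sum(SPLITS["DAVIS17"].values()) == 120
```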
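
To make the Software Dependencies note concrete, here is a minimal sketch of loading the kind of backbone the paper mentions: a ResNet-50 pretrained on ImageNet. PyTorch/torchvision is an assumption on our part; the paper names neither a framework nor versions:

```python
# Minimal backbone-loading sketch. PyTorch/torchvision is assumed here;
# the paper does not state which framework or library versions were used.
import torchvision.models as models

# ResNet-50 with ImageNet-pretrained weights, as mentioned in the paper.
backbone = models.resnet50(pretrained=True)
backbone.eval()  # switch to inference mode for feature extraction
```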
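
The Experiment Setup row translates directly into optimizer and input-size configuration. The following is a minimal sketch assuming a PyTorch implementation; `model` is a hypothetical placeholder, since the authors release no code:

```python
# Training-setup sketch from the reported hyperparameters (PyTorch assumed;
# `model` is a hypothetical stand-in, as the authors release no code).
import torch
from torch import nn, optim

model = nn.Conv2d(3, 2, kernel_size=3, padding=1)  # stand-in for the VOS network

optimizer = optim.Adam(
    model.parameters(),
    lr=1e-5,             # learning rate 10^-5, as reported
    betas=(0.9, 0.999),  # β1 = 0.9, β2 = 0.999
)

BATCH_SIZE = 4           # reported batch size
NUM_EPOCHS = 240         # reported number of training epochs
INPUT_SIZE = (240, 427)  # frames resized to 240 × 427 in training and inference
```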