MaskRNN: Instance Level Video Object Segmentation

Authors: Yuan-Ting Hu, Jia-Bin Huang, Alexander Schwing

NeurIPS 2017

Reproducibility assessment. Each entry lists the variable, the result, and the supporting LLM response.
Research Type: Experimental. "We validate the proposed algorithm on three challenging benchmark datasets, the DAVIS-2016 dataset, the DAVIS-2017 dataset, and the Segtrack v2 dataset, achieving state-of-the-art performance on all of them."
Researcher Affiliation: Academia. Yuan-Ting Hu (UIUC, ythu2@illinois.edu), Jia-Bin Huang (Virginia Tech, jbhuang@vt.edu), Alexander G. Schwing (UIUC, aschwing@illinois.edu).
Pseudocode: No. The paper describes the architecture and process in text and diagrams (Figures 1 and 2) but provides no formal pseudocode or algorithm blocks.
Open Source Code: No. The paper makes no explicit statement about releasing code and provides no link to a code repository.
Open Datasets: Yes. "We use the training set of the DAVIS dataset to pre-train the appearance network for general-purpose object segmentation." The DAVIS-2016 dataset [37] contains 30 training and 20 testing videos; the DAVIS-2017 dataset [39] consists of 60 training and 30 testing videos.
Dataset Splits: Yes. Quantitative evaluation is reported on the validation set of the DAVIS dataset [37]: DAVIS-2016 has 30 training and 20 validation videos; DAVIS-2017 has 60 training and 30 validation videos.
Hardware Specification: No. "We thank NVIDIA for providing the GPUs used in this research." This acknowledgment does not specify the GPU model or count.
Software Dependencies: No. "Note that we use the pre-trained FlowNet2.0 [19] for optical flow computation." The paper names FlowNet2.0 and the Adam solver but gives no version numbers for these or any other software dependencies.
Experiment Setup: Yes. "During offline training all networks are optimized for 10 epochs using the Adam solver [27] and the learning rate is gradually decayed during training, starting from 10^-5." The paper further specifies: "We train the network for 200 iterations, and the learning rate is gradually decayed over time." A hedged sketch of this schedule follows the table.
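
Since no code is released, the following is a minimal sketch of the quoted offline training schedule, assuming PyTorch. The network, data loader, and exact decay rule are placeholders: the paper says only that the learning rate is "gradually decayed", so the per-epoch exponential decay below is an assumption, not the authors' implementation.

```python
# Hypothetical sketch of the quoted setup: Adam, initial learning rate
# 1e-5, gradual decay, 10 epochs of offline training. MaskRNN's real
# networks (appearance stream, flow stream, localization net) are not
# public; a tiny conv layer stands in for them here.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder segmentation net
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # "starting from 10^-5"
# "Gradually decayed" is unspecified; exponential per-epoch decay is one guess.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
criterion = nn.BCEWithLogitsLoss()  # binary foreground/background masks

# Dummy stand-in for DAVIS frame/mask training pairs.
loader = [
    (torch.randn(1, 3, 64, 64), torch.randint(0, 2, (1, 1, 64, 64)).float())
    for _ in range(4)
]

for epoch in range(10):  # "optimized for 10 epochs"
    for frame, mask in loader:
        optimizer.zero_grad()
        loss = criterion(model(frame), mask)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch

# The quoted 200-iteration training with decaying learning rate would
# reuse the same inner loop, stopped after 200 update steps.
```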