Mask Propagation for Efficient Video Semantic Segmentation

Authors: Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on VSPW and Cityscapes demonstrate that our mask propagation framework achieves SOTA accuracy and efficiency trade-offs.
Researcher Affiliation | Collaboration | 1 ZIP Lab, Monash University; 2 Baidu Inc.; 3 ReLER, AAII, UTS; 4 Data61, CSIRO; 5 Mohamed bin Zayed University of AI
Pseudocode | No | The paper describes the method using textual descriptions and equations, but does not provide a formal pseudocode block or algorithm.
Open Source Code | Yes | Code is available at https://github.com/ziplab/MPVSS.
Open Datasets | Yes | We evaluate our method on two benchmark datasets: VSPW [42] and Cityscapes [9].
Dataset Splits | Yes | VSPW is the largest video semantic segmentation benchmark, consisting of 2,806 training clips (198,244 frames), 343 validation clips (24,502 frames), and 387 test clips (28,887 frames).
Hardware Specification | Yes | Frames-per-second (FPS) is measured on a single NVIDIA V100 GPU with 3 repeated runs.
Software Dependencies | No | The paper mentions software components such as Mask2Former, FlowNet, and the AdamW optimizer, but does not specify their versions or those of the underlying programming languages and libraries.
Experiment Setup | Yes | By default, all experiments are trained with a batch size of 16 on 8 NVIDIA GPUs. All the models are trained with the AdamW optimizer [41] for a maximum of 90k iterations and the polynomial learning rate decay schedule [4] with an initial learning rate of 5e-5. For our proposed models, we use 5 as the default key frame interval for comparison.
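
The Experiment Setup row above pins down the optimization recipe (AdamW, 90k iterations, polynomial decay, initial LR 5e-5, batch size 16, key frame interval 5). Below is a minimal PyTorch sketch of that schedule, assuming a standard iteration-based training loop; the stand-in model, the random data, and the polynomial-decay exponent are placeholders and assumptions, not the authors' implementation, which is in the linked repository.

```python
# Hedged sketch of the training schedule reported in the Experiment Setup row.
# The real model and data pipeline are in https://github.com/ziplab/MPVSS.
import torch

MAX_ITERS = 90_000           # maximum training iterations
GLOBAL_BATCH_SIZE = 16       # batch size of 16 (spread over 8 GPUs in the paper)
BASE_LR = 5e-5               # initial learning rate
POLY_POWER = 0.9             # assumed exponent for the polynomial decay (not stated in the row)
KEY_FRAME_INTERVAL = 5       # default key frame interval; consumed by the (omitted) propagation logic
NUM_CLASSES = 124            # number of semantic classes in VSPW

model = torch.nn.Conv2d(3, NUM_CLASSES, kernel_size=1)   # stand-in for the segmentation model
optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR)

# Polynomial learning-rate decay: lr(t) = base_lr * (1 - t / max_iters) ** power
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1.0 - it / MAX_ITERS) ** POLY_POWER
)

for it in range(MAX_ITERS):
    frames = torch.randn(GLOBAL_BATCH_SIZE, 3, 64, 64)             # placeholder video frames
    labels = torch.randint(0, NUM_CLASSES, (GLOBAL_BATCH_SIZE, 64, 64))  # placeholder labels
    loss = torch.nn.functional.cross_entropy(model(frames), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                               # decay LR once per iteration
```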
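
The Hardware Specification row reports FPS averaged over 3 repeated runs on a single NVIDIA V100 GPU. The sketch below shows one plausible way such a measurement is taken; the `measure_fps` helper and its signature are assumptions for illustration, not the authors' benchmarking code.

```python
# Hedged sketch of per-frame inference timing averaged over repeated runs.
import time
import torch

@torch.no_grad()
def measure_fps(model, frames, device="cuda", repeats=3):
    """Average frames-per-second over `repeats` timed inference runs."""
    model = model.to(device).eval()
    frames = [f.to(device) for f in frames]
    runs = []
    for _ in range(repeats):
        if device.startswith("cuda"):
            torch.cuda.synchronize()        # finish queued GPU work before starting the clock
        start = time.perf_counter()
        for f in frames:
            model(f.unsqueeze(0))           # one frame per forward pass
        if device.startswith("cuda"):
            torch.cuda.synchronize()        # wait for the last kernel before stopping the clock
        runs.append(len(frames) / (time.perf_counter() - start))
    return sum(runs) / len(runs)
```

For example, calling `measure_fps(model, frames)` with `frames` being a list of preprocessed validation-frame tensors would reproduce the "3 repeated runs" protocol described above.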