Exploit Domain-Robust Optical Flow in Domain Adaptive Video Semantic Segmentation

Authors: Yuan Gao, Zilei Wang, Jiafan Zhuang, Yixin Zhang, Junjie Li

AAAI 2023

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | The extensive experiments on two challenging benchmarks demonstrate the effectiveness of our method, and it outperforms previous state-of-the-art methods with considerable performance improvement. Our code is available at https://github.com/EdenHazardan/SFC.

Researcher Affiliation | Academia | Yuan Gao¹, Zilei Wang*¹, Jiafan Zhuang², Yixin Zhang¹,³, Junjie Li¹ (¹University of Science and Technology of China; ²Shantou University; ³Institute of Artificial Intelligence, Hefei Comprehensive National Science Center)

Pseudocode | No | The paper describes methods via text and architectural diagrams (e.g., Figure 1, Figure 4) but does not provide formal pseudocode or algorithm blocks.

Open Source Code | Yes | Our code is available at https://github.com/EdenHazardan/SFC.

Open Datasets | Yes | Following DA-VSN (Guan et al. 2021) and TPS (Xing et al. 2022), our experiments involve two challenging synthetic-to-real benchmarks: VIPER → Cityscapes-Seq and SYNTHIA-Seq → Cityscapes-Seq. Cityscapes-Seq (Cordts et al. 2016) is a representative dataset in the semantic segmentation and autonomous driving domain. We use it as the target domain dataset without using any annotations during training. VIPER (Richter, Hayder, and Koltun 2017) is a synthetic dataset... SYNTHIA-Seq (Ros et al. 2016) is also a synthetic dataset...

Dataset Splits | Yes | The training and validation subsets contain 2,975 and 500 videos, respectively, and each video contains 30 frames at a resolution of 1024 × 2048.

Hardware Specification | No | We acknowledge the support of GPU cluster built by MCC Lab of Information Science and Technology Institution, USTC.
Software Dependencies | No | We adopt Accel (Jain, Wang, and Gonzalez 2019) throughout experiments. It consists of two segmentation branches, an optical flow network, and a score fusion layer. Two segmentation branches are used to generate semantic predictions on consecutive frames using DeepLab (Chen et al. 2017), whose backbones are both ResNet-101 (He et al. 2016) pretrained on ImageNet (Deng et al. 2009). FlowNet (Dosovitskiy et al. 2015) is adopted as an optical flow network to propagate prediction from the previous frame, which is pretrained on the Flying Chairs dataset (Dosovitskiy et al. 2015).
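For orientation, below is a minimal PyTorch sketch of an Accel-style head as described in the quote: two per-frame segmentation branches produce class scores, flow from a FlowNet-style network warps the previous-frame scores to the current frame, and a score fusion layer combines the two. The module and argument names, the pixel-space flow convention, and the use of a 1×1 convolution for fusion are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of Accel-style score propagation and fusion (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AccelStyleFusion(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # Score fusion: 1x1 conv over the concatenated class scores of the
        # current-frame branch and the flow-propagated previous-frame branch.
        self.fuse = nn.Conv2d(2 * num_classes, num_classes, kernel_size=1)

    @staticmethod
    def warp(score_prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        """Warp previous-frame scores (N, C, H, W) to the current frame using
        a pixel-displacement optical flow field (N, 2, H, W)."""
        n, _, h, w = score_prev.shape
        ys, xs = torch.meshgrid(
            torch.arange(h, device=flow.device),
            torch.arange(w, device=flow.device),
            indexing="ij",
        )
        base = torch.stack((xs, ys), dim=0).float()        # (2, H, W) pixel grid
        coords = base.unsqueeze(0) + flow                   # displace by flow
        # Normalize coordinates to [-1, 1] as expected by grid_sample.
        gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
        gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
        grid = torch.stack((gx, gy), dim=-1)                # (N, H, W, 2)
        return F.grid_sample(score_prev, grid, align_corners=True)

    def forward(self, score_cur, score_prev, flow):
        propagated = self.warp(score_prev, flow)
        return self.fuse(torch.cat((score_cur, propagated), dim=1))
```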
Experiment Setup | Yes | Implementation details: As in DA-VSN (Guan et al. 2021) and TPS (Xing et al. 2022), we adopt Accel (Jain, Wang, and Gonzalez 2019) throughout experiments... uses an SGD optimizer with a momentum of 0.9 and a weight decay of 5 × 10⁻⁴. The learning rate is set at 2.5 × 10⁻⁴ for backbone parameters and 2.5 × 10⁻³ for others, which is annealed following the poly learning rate policy... We set λf as 0.005 and 0.001 in two training stages respectively, while set λs = 100 in both stages.
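A hedged sketch of how the quoted optimizer settings and poly schedule could be wired up in PyTorch follows. The split into `model.backbone` / `model.classifier` parameter groups, the poly power of 0.9, and the iteration budget are assumptions for illustration; only the base learning rates, momentum, and weight decay come from the quote above.

```python
# Illustrative optimizer setup matching the reported hyperparameters.
import torch


def build_optimizer(model, max_iters: int, power: float = 0.9):
    # Two parameter groups: 2.5e-4 for the backbone, 2.5e-3 for the rest
    # (here assumed to live under `model.classifier`).
    param_groups = [
        {"params": model.backbone.parameters(), "lr": 2.5e-4},
        {"params": model.classifier.parameters(), "lr": 2.5e-3},
    ]
    optimizer = torch.optim.SGD(param_groups, momentum=0.9, weight_decay=5e-4)

    # Poly policy: lr = base_lr * (1 - iter / max_iters) ** power
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer,
        lr_lambda=lambda it: max(0.0, 1.0 - it / max_iters) ** power,
    )
    return optimizer, scheduler
```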