Multi-source Domain Adaptation for Semantic Segmentation

Authors: Sicheng Zhao, Bo Li, Xiangyu Yue, Yang Gu, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments adapting from the synthetic GTA and SYNTHIA datasets to the real Cityscapes and BDDS datasets demonstrate that the proposed MADAN model outperforms state-of-the-art approaches.
Researcher Affiliation | Collaboration | Sicheng Zhao (1), Bo Li (2,3), Xiangyu Yue (1), Yang Gu (2), Pengfei Xu (2), Runbo Hu (2), Hua Chai (2), Kurt Keutzer (1); (1) University of California, Berkeley, USA; (2) Didi Chuxing, China; (3) Harbin Institute of Technology, China
Pseudocode | No | The paper provides a high-level framework diagram (Figure 1) and describes the components and training process in text, but it does not include pseudocode or algorithm blocks. (A schematic reconstruction of one training iteration is sketched after this table.)
Open Source Code | Yes | Our source code is released at: https://github.com/Luodian/MADAN.
Open Datasets | Yes | In our adaptation experiments, we use synthetic GTA [18] and SYNTHIA [19] datasets as the source domains and real Cityscapes [15] and BDDS [72] datasets as the target domains. (A minimal target-domain loading sketch follows the table.)
Dataset Splits | No | The paper notes that training images are cropped to 400×400 during pixel-level adaptation training for 20 epochs, but it does not specify a validation split, percentage, or number of validation samples.
Hardware Specification | Yes | In our experiments, MADAN is trained on 4 NVIDIA Tesla P40 GPUs for 40 hours using two source domains, which is about twice the training time as on a single source.
Software Dependencies | No | The paper states: "The network is implemented in PyTorch and trained with Adam optimizer [75]..." While PyTorch and the Adam optimizer are named, specific version numbers for these software dependencies are not provided.
Experiment Setup | Yes | The network is implemented in PyTorch and trained with the Adam optimizer [75] using a batch size of 8 with initial learning rate 1e-4. All images are resized to 600×1080 and then cropped to 400×400 during the training of the pixel-level adaptation for 20 epochs. SAD and CCD are frozen in the first 5 and 10 epochs, respectively. (A configuration sketch follows the table.)
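
The paper itself contains no pseudocode, so the following is a schematic, hypothetical reconstruction of one training iteration, pieced together from the textual description of Figure 1. Every module and tensor below is a toy stand-in; this is not the authors' released implementation (see the linked repository for that).

```python
# Schematic sketch of one MADAN-style training iteration, reconstructed from
# the paper's textual description of Figure 1. Toy stand-ins throughout; this
# is NOT the authors' released implementation.
import torch
import torch.nn as nn

G1 = nn.Conv2d(3, 3, 1)        # pixel-level generator: source 1 -> target style
G2 = nn.Conv2d(3, 3, 1)        # pixel-level generator: source 2 -> target style
seg_net = nn.Conv2d(3, 19, 1)  # segmentation network (19 Cityscapes classes)
d_feat = nn.Conv2d(19, 1, 1)   # feature-level domain discriminator (stand-in)

x1 = torch.rand(2, 3, 400, 400)           # batch from source 1 (e.g., GTA)
x2 = torch.rand(2, 3, 400, 400)           # batch from source 2 (e.g., SYNTHIA)
xt = torch.rand(2, 3, 400, 400)           # unlabeled target batch (e.g., Cityscapes)
y1 = torch.randint(0, 19, (2, 400, 400))  # source-1 ground-truth labels
y2 = torch.randint(0, 19, (2, 400, 400))  # source-2 ground-truth labels

# 1) Pixel-level adaptation: translate each source toward the target style.
#    In the paper this is trained adversarially with cycle-consistency and
#    dynamic semantic-consistency losses (omitted in this sketch).
x1_t, x2_t = G1(x1), G2(x2)

# 2) Adversarial domain aggregation: the SAD and CCD discriminators pull the
#    two adapted sub-domains toward one aggregated domain (losses omitted).

# 3) Task training: supervise the segmentation network on the adapted images
#    with the original source labels, plus a feature-level adversarial term
#    against the target batch (shown here as a bare placeholder).
ce = nn.CrossEntropyLoss()
seg_loss = ce(seg_net(x1_t), y1) + ce(seg_net(x2_t), y2)
adv_feat = d_feat(seg_net(xt)).mean()  # placeholder for the adversarial loss
(seg_loss + adv_feat).backward()
```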
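
For the target domain, torchvision ships a built-in Cityscapes dataset class; GTA and SYNTHIA have no torchvision counterpart and would need custom torch.utils.data.Dataset implementations. A minimal loading sketch, with a placeholder root path:

```python
# Minimal sketch of loading the target-domain data with torchvision's built-in
# Cityscapes class; the root path is a placeholder. GTA and SYNTHIA require
# custom torch.utils.data.Dataset classes (not shown).
from torchvision.datasets import Cityscapes

cityscapes_train = Cityscapes(
    root="./data/cityscapes",  # placeholder; expects leftImg8bit/ and gtFine/
    split="train",
    mode="fine",
    target_type="semantic",
)
image, semantic_mask = cityscapes_train[0]
```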
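
The reported hyperparameters translate directly into a PyTorch configuration. A minimal sketch, assuming hypothetical stand-in modules for the segmentation network and the SAD/CCD discriminators:

```python
# Minimal sketch of the reported training configuration: Adam, lr 1e-4,
# batch size 8, resize to 600x1080 then crop to 400x400, 20 epochs, SAD
# frozen for the first 5 epochs and CCD for the first 10. Module names are
# hypothetical stand-ins, not the released code.
import torch
import torch.nn as nn
import torchvision.transforms as T

train_transform = T.Compose([
    T.Resize((600, 1080)),    # resize as reported
    T.RandomCrop((400, 400)), # crop for pixel-level adaptation training
    T.ToTensor(),
])

seg_net = nn.Conv2d(3, 19, 1)  # stand-in segmentation network
sad = nn.Conv2d(3, 1, 1)       # stand-in sub-domain aggregation discriminator
ccd = nn.Conv2d(3, 1, 1)       # stand-in cross-domain cycle discriminator

optimizer = torch.optim.Adam(seg_net.parameters(), lr=1e-4)
batch_size = 8  # as reported

def set_frozen(module: nn.Module, frozen: bool) -> None:
    """Freeze or unfreeze all parameters of a module."""
    for p in module.parameters():
        p.requires_grad = not frozen

for epoch in range(20):
    set_frozen(sad, frozen=epoch < 5)   # SAD frozen for the first 5 epochs
    set_frozen(ccd, frozen=epoch < 10)  # CCD frozen for the first 10 epochs
    # ... per-batch updates with DataLoader(batch_size=batch_size) go here ...
```

The freezing schedule simply toggles requires_grad, one common way to keep a sub-network fixed during early epochs while the rest of the model trains.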