Addressing Domain Gap via Content Invariant Representation for Semantic Segmentation
Authors: Li Gao, Lefei Zhang, Qian Zhang (pp. 7528–7536)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on two domain adaptation tasks, that is, GTAV → Cityscapes and SYNTHIA → Cityscapes, clearly demonstrate the superiority of our method compared with state-of-the-art methods. |
| Researcher Affiliation | Collaboration | 1 National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, School of Computer Science and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China. 2 Horizon Robotics, Inc., Beijing, China. |
| Pseudocode | No | The paper includes diagrams (e.g., Figure 2 and Figure 3) and descriptive text for its methods, but it does not contain any formal pseudocode blocks or algorithms labeled as such. |
| Open Source Code | No | The paper does not provide a direct link to the source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | Yes | Following the experimental setup of previous works (Tsai et al. 2019; Chang et al. 2019), we conduct extensive experiments on two adaptation tasks, that is, GTAV (Richter et al. 2016) → Cityscapes (Cordts et al. 2016) and SYNTHIA (Ros et al. 2016) → Cityscapes. GTAV is a dataset which contains 24,966 synthetic urban-scene images with a resolution of 1,914 × 1,052. For training, we consider 19 common semantic-label categories to be compatible with the Cityscapes dataset. SYNTHIA: SYNTHIA-RAND-CITYSCAPES is another photorealistic synthetic image dataset, which consists of 9,400 images with a resolution of 1,280 × 760. We validate on the 16 classes shared with the Cityscapes dataset, and the evaluation of 13 classes is also reported. Cityscapes is a real-world collected dataset which provides 5,000 densely annotated images with 2,048 × 1,024 resolution. We use 2,975 training images for training and 500 validation images for testing. |
| Dataset Splits | Yes | Cityscapes is a real-world collected dataset which provides 5,000 densely annotated images with 2,048 × 1,024 resolution. We use 2,975 training images for training and 500 validation images for testing. |
| Hardware Specification | Yes | We implement the proposed framework using the PyTorch toolbox on a single Tesla V100 GPU with 16 GB memory. |
| Software Dependencies | No | The paper mentions software components like 'PyTorch toolbox', 'Adam optimizer', 'DeepLab-v2', 'ResNet-101', 'ImageNet', and 'SGD' but does not specify their version numbers. |
| Experiment Setup | Yes | The model is trained using the Adam (Kingma and Ba 2014) optimizer with an initial learning rate of 2 × 10⁻⁴ and β₁ = 0.5, β₂ = 0.999. Batch size is set to 1 for all stages. During training, we use SGD (Bottou 2010) as the optimizer for the segmentation network with a momentum of 0.9 and an initial learning rate of 2.5 × 10⁻⁴, while utilizing Adam to optimize D with β₁ = 0.9, β₂ = 0.99 and an initial learning rate of 1 × 10⁻⁴. We set a weight decay of 5 × 10⁻⁴ for the optimizers with a poly learning-rate decay policy. We train the network for 100,000 iterations. We resize Cityscapes, GTAV, and SYNTHIA to 1,024 × 512, 1,280 × 720, and 1,280 × 760, respectively. We also set K, λ_adv, λ_cos, and ϵ to 2, 0.001, 40, and 0.4. |
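The optimizer settings quoted in the Experiment Setup row can be collected into a minimal, runnable sketch. The dictionary names and the `poly_lr` helper are illustrative, not identifiers from the paper, and the decay power of 0.9 is the conventional choice for a poly schedule — the paper states the policy but not the power.

```python
# Hyperparameters quoted from the paper's experiment setup.
# (Names below are descriptive placeholders, not the authors' identifiers.)
SEG_OPTIMIZER = {          # SGD for the segmentation network
    "type": "SGD",
    "lr": 2.5e-4,
    "momentum": 0.9,
    "weight_decay": 5e-4,
}
DISC_OPTIMIZER = {         # Adam for the discriminator D
    "type": "Adam",
    "lr": 1e-4,
    "betas": (0.9, 0.99),
    "weight_decay": 5e-4,
}
MAX_ITERS = 100_000        # total training iterations

def poly_lr(base_lr: float, iteration: int,
            max_iter: int = MAX_ITERS, power: float = 0.9) -> float:
    """Poly learning-rate decay: lr = base_lr * (1 - iter/max_iter)^power.

    power=0.9 is an assumption (common convention); the paper only says
    "poly learning rate decay policy".
    """
    return base_lr * (1.0 - iteration / max_iter) ** power

# Learning rate starts at the base value and decays smoothly to zero.
start_lr = poly_lr(SEG_OPTIMIZER["lr"], 0)        # 2.5e-4 at iteration 0
mid_lr = poly_lr(SEG_OPTIMIZER["lr"], 50_000)     # roughly half-decayed
end_lr = poly_lr(SEG_OPTIMIZER["lr"], MAX_ITERS)  # 0.0 at the final iteration
```

These dictionaries map directly onto `torch.optim.SGD(..., lr=2.5e-4, momentum=0.9, weight_decay=5e-4)` and `torch.optim.Adam(..., lr=1e-4, betas=(0.9, 0.99))` in the PyTorch toolbox the paper uses.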