Generalized Semantic Segmentation by Self-Supervised Source Domain Projection and Multi-Level Contrastive Learning

Authors: Liwei Yang, Xiang Gu, Jian Sun

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments for semantic segmentation demonstrate the favorable generalization capability of our method on benchmark datasets.
Researcher Affiliation | Collaboration | 1 School of Mathematics and Statistics, Xi'an Jiaotong University, China; 2 Pazhou Laboratory (Huangpu), China; 3 Peng Cheng Laboratory, China. {yangliwei, xianggu}@stu.xjtu.edu.cn, jiansun@xjtu.edu.cn
Pseudocode | No | The paper describes its methods and processes using textual descriptions and diagrams (e.g., Figures 2, 3, and 4), but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/liweiyangv/DPCL.
Open Datasets | Yes | Synthetic Datasets. GTAV (G) (Richter et al. 2016) is a synthetic dataset... SYNTHIA (S) (Ros et al. 2016) is another synthetic dataset. ... Real-World Datasets. Cityscapes (C) (Cordts et al. 2016) is a high-resolution dataset... BDD (B) (Yu et al. 2020) is another real-world dataset... The last real-world dataset we use is Mapillary (M) (Neuhold et al. 2017).
Dataset Splits | Yes | GTAV (G) (Richter et al. 2016) is a synthetic dataset, which contains 24,966 images with a resolution of 1914×1052 along with their pixel-wise semantic labels, and it has 12,403, 6,382, and 6,181 images for training, validation, and test, respectively.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications used for running its experiments.
Software Dependencies | No | The paper mentions various models and optimizers (e.g., DeepLabV3+, ResNet50, SGD optimizer) but does not provide specific version numbers for any software, libraries, or frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | Implementation Details. We use ResNet50 (He et al. 2016), ShuffleNetV2 (Ma et al. 2018), and MobileNetV2 (Sandler et al. 2018) as our segmentation backbones for the tasks GTAV to Cityscapes, BDD, and Mapillary, and Cityscapes to BDD, SYNTHIA, and GTAV. We take the SGD optimizer with an initial learning rate of 1e-3, and train the segmentation model for 40k iterations with a batch size of 8, momentum of 0.9, and weight decay of 5e-4. We adopt polynomial learning rate scheduling (Liu, Rabinovich, and Berg 2015) with a power of 0.9. ... In the multi-level contrastive learning, we sample thirty pixel features in each class from a batch of images... We respectively use q = 10 and q = 5 for the tasks trained on GTAV and Cityscapes. The other parameters are set as ξ = 0.5, τ = 0.1, λ = 5. ... we warm up our segmentation model using only L_task for ten epochs and then use L_total to train.
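
For readers attempting to reproduce the quoted training setup, the sketch below shows how the reported optimizer, polynomial learning-rate schedule, and warm-up schedule could be wired together, assuming a PyTorch implementation. The dummy model, dummy data, the warm-up iteration count, and the placeholder contrastive term are illustrative assumptions, not the authors' DPCL code.

```python
# Minimal sketch of the reported setup: SGD (lr 1e-3, momentum 0.9, weight decay 5e-4),
# 40k iterations at batch size 8, polynomial LR decay with power 0.9, and a warm-up
# phase that uses only the task loss before switching to the full objective.
import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import LambdaLR

TOTAL_ITERS = 40_000        # "40k iterations"
BATCH_SIZE = 8
BASE_LR = 1e-3
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4
POLY_POWER = 0.9            # polynomial LR scheduling, power 0.9
WARMUP_ITERS = 3_000        # placeholder for "ten epochs" of warm-up; depends on dataset size

# Stand-in segmentation head; the paper uses DeepLabV3+ with a ResNet50 /
# ShuffleNetV2 / MobileNetV2 backbone.
model = nn.Conv2d(3, 19, kernel_size=1)

optimizer = SGD(model.parameters(), lr=BASE_LR,
                momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)

# Polynomial decay: lr_t = base_lr * (1 - t / T)^power
scheduler = LambdaLR(optimizer,
                     lr_lambda=lambda it: (1.0 - it / TOTAL_ITERS) ** POLY_POWER)

criterion = nn.CrossEntropyLoss()

for it in range(TOTAL_ITERS):
    images = torch.randn(BATCH_SIZE, 3, 64, 64)          # dummy batch
    labels = torch.randint(0, 19, (BATCH_SIZE, 64, 64))  # dummy per-pixel labels

    logits = model(images)
    task_loss = criterion(logits, labels)

    if it < WARMUP_ITERS:
        loss = task_loss                     # warm-up: L_task only
    else:
        contrastive_loss = torch.zeros(())   # placeholder for the multi-level contrastive terms
        loss = task_loss + 5.0 * contrastive_loss  # lambda = 5 as reported

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The multi-level contrastive terms (thirty sampled pixel features per class, q = 10 or q = 5, ξ = 0.5, τ = 0.1) are left as a placeholder here; the released repository linked above should be consulted for the authors' actual loss implementations.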