Generalized Semantic Segmentation by Self-Supervised Source Domain Projection and Multi-Level Contrastive Learning
Authors: Liwei Yang, Xiang Gu, Jian Sun
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments for semantic segmentation demonstrate the favorable generalization capability of our method on benchmark datasets. |
| Researcher Affiliation | Collaboration | 1 School of Mathematics and Statistics, Xi'an Jiaotong University, China; 2 Pazhou Laboratory (Huangpu), China; 3 Peng Cheng Laboratory, China. {yangliwei, xianggu}@stu.xjtu.edu.cn, jiansun@xjtu.edu.cn |
| Pseudocode | No | The paper describes its methods and processes using textual descriptions and diagrams (e.g., Figure 2, 3, 4), but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/liweiyangv/DPCL. |
| Open Datasets | Yes | Synthetic Datasets. GTAV (G) (Richter et al. 2016) is a synthetic dataset... SYNTHIA (S) (Ros et al. 2016) is another synthetic dataset. ... Real-World Datasets. Cityscapes (C) (Cordts et al. 2016) is a high-resolution dataset... BDD (B) (Yu et al. 2020) is another real-world dataset... The last real-world dataset we use is Mapillary (M) (Neuhold et al. 2017). |
| Dataset Splits | Yes | GTAV (G) (Richter et al. 2016) is a synthetic dataset, which contains 24,966 images with a resolution of 1914 × 1052 along with their pixel-wise semantic labels, and it has 12,403, 6,382, and 6,181 images for training, validation, and test, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions various models and optimizers (e.g., DeepLabV3+, ResNet50, SGD optimizer) but does not provide specific version numbers for any software, libraries, or frameworks (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | Implementation Details. We use ResNet50 (He et al. 2016), ShuffleNetV2 (Ma et al. 2018) and MobileNetV2 (Sandler et al. 2018) as our segmentation backbones for the task GTAV to Cityscapes, BDD and Mapillary and the task Cityscapes to BDD, SYNTHIA and GTAV. We take SGD optimizer with an initial learning rate of 1e-3, and train segmentation model for 40k iterations with batch size of 8, momentum of 0.9 and weight decay of 5e-4. We adopt the polynomial learning rate scheduling (Liu, Rabinovich, and Berg 2015) with the power of 0.9. ... In the multi-level contrastive learning, we sample thirty pixel features in each class from a batch of images... We respectively use q = 10 and q = 5 for the task trained on GTAV and Cityscapes. The other parameters are set as ξ = 0.5, τ = 0.1, λ = 5. ... we warm up our segmentation model only using L_task for ten epochs and then use L_total to train. |
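
The learning-rate settings quoted in the Experiment Setup row (initial rate 1e-3, 40k iterations, polynomial scheduling with power 0.9) can be illustrated with a short sketch. It assumes the commonly used polynomial decay form, lr = base_lr · (1 − iter / max_iter)^power; the quoted text does not spell out the formula, so this may differ from what the authors' released code does. The function name `poly_lr` and its parameter names are hypothetical and chosen only for illustration.

```python
def poly_lr(iteration, base_lr=1e-3, max_iter=40_000, power=0.9):
    """Learning rate at a given iteration under polynomial decay.

    Assumption: the standard "poly" form
        lr = base_lr * (1 - iteration / max_iter) ** power
    The defaults mirror the values quoted in the table above; the
    formula itself is not stated there.
    """
    if not 0 <= iteration <= max_iter:
        raise ValueError("iteration must lie in [0, max_iter]")
    return base_lr * (1.0 - iteration / max_iter) ** power


if __name__ == "__main__":
    # Print the schedule at a few checkpoints of the 40k-iteration run.
    for it in (0, 10_000, 20_000, 30_000, 40_000):
        print(f"iter {it:>6}: lr = {poly_lr(it):.6e}")
```

Under this assumption the rate starts at 1e-3 and decays monotonically to zero at the final iteration. How the schedule interacts with the warm-up phase that uses only L_task is not specified in the quoted text, so the sketch leaves it out.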