When Visual Prompt Tuning Meets Source-Free Domain Adaptive Semantic Segmentation
Authors: Xinhong Ma, Yiming Wang, Hao Liu, Tianyu Guo, Yunhe Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that Uni-UVPT achieves state-of-the-art performance on GTA5 → Cityscapes and SYNTHIA → Cityscapes tasks |
| Researcher Affiliation | Industry | Xinhong Ma, Yiming Wang, Hao Liu, Tianyu Guo, Yunhe Wang; Huawei Noah's Ark Lab; {maxinhong, wangyiming22, liuhao296, tianyu.guo, yunhe.wang}@huawei.com |
| Pseudocode | No | The paper describes its methods textually and with architectural diagrams (Figure 1), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/uni-uvpt and https://github.com/huawei-noah/noah-research/tree/master/uni-uvpt. |
| Open Datasets | Yes | GTA5 [37] is a large-scale driving-game dataset... The realistic dataset Cityscapes [10] collects street view scenes... Synthia dataset [38] is rendered from a virtual city... |
| Dataset Splits | Yes | The realistic dataset Cityscapes [10] collects street view scenes from 50 different cities with 19 classes, including 2,975 training images and 500 validation images. |
| Hardware Specification | Yes | Our approach is implemented based on the MMSegmentation framework [9] and one training task requires one NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions using the 'MMSegmentation framework [9]' and specific models like 'Swin-B [31]' and 'MiT-B5 [43]', but does not provide specific version numbers for these software components or other dependencies (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | Specifically, the learning rate of the Swin-B encoder is set as 6 × 10⁻⁶ and 4 × 10⁻⁶ for the MiT-B5 encoder, while the learning rates of the segmentation head and prompt adapter are respectively set as ten and five times that of the backbone. Our Uni-UVPT framework typically needs 40k-80k iterations with a batch size of 1 until convergence. |
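
The learning-rate scheme quoted in the Experiment Setup row can be made concrete with a small sketch. It only reproduces the arithmetic stated there (a backbone rate per encoder, with the segmentation head at ten times and the prompt adapter at five times that rate). The group names `backbone`, `decode_head`, and `prompt_adapter` and the function `build_lr_groups` are hypothetical labels chosen for illustration, not identifiers taken from the authors' code.

```python
# Backbone learning rates as quoted in the Experiment Setup row.
BACKBONE_LR = {"swin-b": 6e-6, "mit-b5": 4e-6}

# Multipliers relative to the backbone rate, as quoted in the same row.
HEAD_LR_MULT = 10.0     # segmentation head: ten times the backbone rate
ADAPTER_LR_MULT = 5.0   # prompt adapter: five times the backbone rate


def build_lr_groups(backbone: str) -> dict:
    """Return a learning rate per module group for the given encoder.

    The group names are hypothetical; a real training config would map
    them onto whatever module names the model actually uses.
    """
    key = backbone.lower()
    if key not in BACKBONE_LR:
        raise ValueError(f"unknown backbone: {backbone!r}")
    base = BACKBONE_LR[key]
    return {
        "backbone": base,
        "decode_head": base * HEAD_LR_MULT,
        "prompt_adapter": base * ADAPTER_LR_MULT,
    }


if __name__ == "__main__":
    for name in ("Swin-B", "MiT-B5"):
        print(name, build_lr_groups(name))
```

Under these assumptions, the Swin-B setting gives roughly 6e-5 for the head and 3e-5 for the prompt adapter, and the MiT-B5 setting gives roughly 4e-5 and 2e-5. The table does not state the optimizer, schedule, or weight decay, so they are left out here.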