Cross-Modal Contrastive Learning for Domain Adaptation in 3D Semantic Segmentation
Authors: Bowei Xing, Xianghua Ying, Ruibin Wang, Jinfa Yang, Taiyan Chen
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on three unsupervised domain adaptation scenarios: country-to-country, day-to-night, and dataset-to-dataset. Experimental results show that our approach outperforms existing methods, demonstrating the effectiveness of the proposed method. |
| Researcher Affiliation | Academia | Key Laboratory of Machine Perception (MOE), School of Intelligence Science and Technology, Peking University. {xingbowei, xhying, robin_wang, jinfayang}@pku.edu.cn, chenty@stu.pku.edu.cn |
| Pseudocode | No | The paper includes architectural diagrams (Figure 1 and 2) but does not contain any formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | Yes | Three autonomous driving datasets are adopted: nuScenes (Caesar et al. 2020), A2D2 (Geyer et al. 2020), and SemanticKITTI (Behley et al. 2019). |
| Dataset Splits | No | The paper mentions using the nuScenes, A2D2, and SemanticKITTI datasets and adaptation scenarios such as USA/Singapore and Day/Night, but it does not explicitly state the training, validation, and test splits needed for reproduction. |
| Hardware Specification | No | The paper mentions 'empirical GPU memory concern' but does not provide any specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using U-Net and SparseConvNet as backbones but does not provide version numbers for software dependencies such as the programming language, libraries, or frameworks (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | In the training process, the learning rate is initially set to 0.001 and is divided by 10 at 80k and 90k iterations. We train the model for a total of 100k iterations on each adaptation scenario. For the neighborhood features, we adopt the nearby 5 × 5 region. For dilated neighbor features, we sample features from the nearby 9 × 9 region with dilation rate 2, which also yields 25 features per pixel. The batch size is set to 8 for USA/Singapore and Day/Night, and 6 for A2D2/SemanticKITTI. Due to GPU memory limitations, 30% of the features in each minibatch are sampled to compute the contrastive loss in the former two scenarios, and 20% in the last. (Sketches of this schedule and of the neighborhood sampling follow the table.) |
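The setup row pins down a concrete optimization recipe. A minimal sketch, assuming PyTorch (the paper names U-Net and SparseConvNet backbones but no framework), of the learning-rate schedule and the memory-motivated feature subsampling; the model, the optimizer choice, and the contrastive loss itself are placeholders, not the authors' code:

```python
import torch
import torch.nn as nn

# Placeholder module standing in for the 2D U-Net / 3D SparseConvNet
# backbones; only the optimization schedule is taken from the paper.
model = nn.Linear(64, 10)

# Initial learning rate 0.001; the optimizer choice is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Divide the learning rate by 10 at 80k and 90k iterations;
# training runs for 100k iterations in total.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[80_000, 90_000], gamma=0.1)

def subsample_for_contrastive(feats: torch.Tensor, ratio: float) -> torch.Tensor:
    """Randomly keep `ratio` of the features in a minibatch before the
    contrastive loss: 0.3 for USA/Singapore and Day/Night, 0.2 for
    A2D2/SemanticKITTI, per the GPU-memory note in the setup row."""
    keep = max(1, int(feats.shape[0] * ratio))
    idx = torch.randperm(feats.shape[0], device=feats.device)[:keep]
    return feats[idx]
```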
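The 5 × 5 and dilated 9 × 9 neighborhoods each yield 25 features per pixel. One plausible reading, again assuming PyTorch: a dense 5 × 5 unfold, and a 5 × 5 unfold with dilation 2, whose receptive extent is (5 − 1) × 2 + 1 = 9. This is an illustration of the counting, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def neighborhood_feats(x: torch.Tensor, dilated: bool = False) -> torch.Tensor:
    """Gather 25 neighboring features per pixel from a (B, C, H, W) map:
    a dense 5x5 window, or a 5x5 window with dilation 2 spanning the
    9x9 region mentioned in the setup row."""
    b, c, h, w = x.shape
    if dilated:
        # kernel 5, dilation 2 -> 9x9 extent; padding 4 keeps H, W.
        cols = F.unfold(x, kernel_size=5, dilation=2, padding=4)
    else:
        cols = F.unfold(x, kernel_size=5, padding=2)
    # (B, C*25, H*W) -> (B, C, 25, H, W)
    return cols.view(b, c, 25, h, w)

feats = torch.randn(2, 16, 32, 32)
assert neighborhood_feats(feats).shape == (2, 16, 25, 32, 32)
assert neighborhood_feats(feats, dilated=True).shape == (2, 16, 25, 32, 32)
```

Both variants return exactly 25 samples per pixel, matching the paper's statement that the dilated 9 × 9 sampling "also leads to a number of 25 features for each pixel."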