Mx2M: Masked Cross-Modality Modeling in Domain Adaptation for 3D Semantic Segmentation

Authors: Boxiang Zhang, Zunran Wang, Yonggen Ling, Yuanyuan Guan, Shenghao Zhang, Wenhui Li

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluation of the Mx2M on three DA scenarios, including Day/Night, USA/Singapore, and A2D2/SemanticKITTI, brings large improvements over previous methods on many metrics.
Researcher Affiliation | Collaboration | 1. College of Computer Science and Technology, Jilin University, Changchun, China; 2. Robotics X, Tencent, Shenzhen, China
Pseudocode | No | The paper describes the method using text and diagrams (Figures 2 and 3) but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither links to a source code repository nor states that the code for the described methodology will be released.
Open Datasets | Yes | Three autonomous driving datasets are chosen, including nuScenes (Caesar et al. 2020), A2D2 (Geyer et al. 2019), and SemanticKITTI (Behley et al. 2019), where LiDAR and camera are synchronized and calibrated.
Dataset Splits | No | The paper defines source and target datasets for each domain adaptation scenario but does not explicitly provide the train/validation/test splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | Yes | We use the PyTorch 1.7.1 framework on an NVIDIA Tesla V100 GPU card with 32GB RAM under CUDA 11.0 and cuDNN 8.0.5.
Software Dependencies | Yes | We use the PyTorch 1.7.1 framework on an NVIDIA Tesla V100 GPU card with 32GB RAM under CUDA 11.0 and cuDNN 8.0.5.
Experiment Setup | Yes | For nuScenes, mini-batch Adam (Kingma and Ba 2015) is configured with a batch size of 8, β1 of 0.9, and β2 of 0.999. All models are trained for 100k iterations with an initial learning rate of 1e-3, which is divided by 10 at the 80k iteration and again at the 90k iteration. For A2D2/SemanticKITTI, the batch size is reduced to 4 and models are trained for 200k iterations, with the other configurations adjusted accordingly; this change is due to limited memory.
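The step schedule quoted above (initial learning rate 1e-3, divided by 10 at the 80k and again at the 90k iteration) can be sketched as a small helper. This is a minimal illustration, not code from the paper; the function name `lr_at_step` and its defaults are assumptions, and in PyTorch the same behavior would come from `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[80_000, 90_000]` and `gamma=0.1`.

```python
def lr_at_step(step, base_lr=1e-3, milestones=(80_000, 90_000), gamma=0.1):
    """Piecewise-constant decay: multiply base_lr by gamma at each milestone.

    Sketch of the nuScenes schedule described in the paper
    (lr 1e-3 for steps 0..79,999, ~1e-4 from 80k, ~1e-5 from 90k).
    Names and defaults here are illustrative assumptions.
    """
    passed = sum(1 for m in milestones if step >= m)  # milestones already reached
    return base_lr * (gamma ** passed)
```

For the A2D2/SemanticKITTI runs trained for 200k iterations, the milestones would presumably shift with the longer schedule, but the paper's wording only says the other configurations are adjusted, so no specific values are assumed here.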