CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving

Authors: Runjian Chen, Yao Mu, Runsen Xu, Wenqi Shao, Chenhan Jiang, Hang Xu, Yu Qiao, Zhenguo Li, Ping Luo

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we design experiments to answer the question of whether CO3 learns such representations compared to previous methods. We first provide experiment setups in Sec. 4.1 and then discuss main results in Sec. 4.2. We also conduct an ablation study and qualitative visualization in Sec. 4.3 and 4.4.
Researcher Affiliation Collaboration Runjian Chen 1, Yao Mu 1, Runsen Xu 4, Wenqi Shao 2, Chenhan Jiang 5, Hang Xu 3, Yu Qiao 2, Zhenguo Li 3, Ping Luo 1. Affiliations: 1 The University of Hong Kong; 2 Shanghai AI Laboratory; 3 Huawei Noah's Ark Lab; 4 The Chinese University of Hong Kong; 5 Hong Kong University of Science and Technology. Emails: {rjchen, muyao}@connect.hku.hk, pluo.lhi@gmail.com, {shaowenqi, qiaoyu}@pjlab.org.cn, li.zhenguo@huawei.com, xbjxh@live.com, runsenxu@connect.cuhk.edu.hk, cjiangao@connect.hkust.hk
Pseudocode Yes Algorithm 1 Implementation of Contextual Shape Computation in Python Style.
Open Source Code No Code and models will be released here.
Open Datasets Yes We utilize the recently released vehicle-infrastructure-cooperation dataset called DAIR-V2X (Yu et al., 2022) to pre-train the 3D encoder.
Dataset Splits No No explicit train/validation/test splits with percentages or counts are provided. The paper refers to "common practice" for dataset usage and mentions training schedules (e.g., "1/8 training schedule") without specifying the exact dataset splits.
Hardware Specification No No specific hardware details like exact GPU/CPU models or processor types are provided. The paper only mentions "GPU numbers 8 4 4" and "different types of GPUs" in the implementation details without further specification.
Software Dependencies No The paper mentions "PyTorch (Paszke et al., 2019)" but does not specify its version number. Other software components like "Sparse-Convolution" and codebases like "MMDetection3D" and "Open PCDet" are mentioned without specific version numbers.
Experiment Setup Yes Implementation Details of CO3. We use Sparse-Convolution as the 3D encoder... We set the number of feature channels denc = 64, the temperature parameter in contrastive learning τ = 0.07, the dimension of the common feature space of vehicle-side and fusion point clouds dz = 256, and the sample number in the cooperative contrastive loss N1 = 2048. For contextual shape prediction, we set the number of bins Nbin = 32, the sample number N2 = 2048, and the weighting constant w = 10. The threshold for ground point filtering is zthd = 1.6m. We empirically find that freezing the parameters of MLP2 brings better results in the detection task, thus we fix them. (Table 8):
Configuration | Pre-training | KITTI | Once
optimizer | AdamW | Adam | Adam
base learning rate | 0.0001 | 0.003 | 0.003
weight decay | 0.01 | 0.01 | 0.01
batch size | 16
learning rate schedule | cyclic | cyclic | cyclic
GPU numbers | 8 | 4 | 4
training epochs | 20 | 80 | 80
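To make the quoted hyperparameters concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss between paired vehicle-side and fusion point features, using the stated values (τ = 0.07, dz = 256, N1 = 2048). This is an illustrative reconstruction, not the authors' released code; the function name and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def cooperative_contrastive_loss(z_vehicle, z_fusion, tau=0.07):
    """Generic InfoNCE loss over N1 paired point features of shape (N1, d_z).

    Features at the same index in the two views are treated as positives;
    every other pairing in the sample acts as a negative.
    """
    z_vehicle = F.normalize(z_vehicle, dim=1)          # unit-length features
    z_fusion = F.normalize(z_fusion, dim=1)
    logits = z_vehicle @ z_fusion.t() / tau            # (N1, N1) cosine similarities / temperature
    targets = torch.arange(z_vehicle.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Example with the quoted sizes: N1 = 2048 sampled points, d_z = 256 channels
loss = cooperative_contrastive_loss(torch.randn(2048, 256), torch.randn(2048, 256))
```

The temperature τ sharpens the softmax over similarities; with random features the loss starts near log(N1), and it decreases as matched vehicle/fusion features align.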