Unleash the Potential of Image Branch for Cross-modal 3D Object Detection
Authors: Yifan Zhang, Qijian Zhang, Junhui Hou, Yixuan Yuan, Guoliang Xing
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments and ablation studies validate the effectiveness of our method. Notably, we achieved the top rank in the highly competitive cyclist class of the KITTI benchmark at the time of submission. The source code is available at https://github.com/Eaphan/UPIDet. |
| Researcher Affiliation | Academia | Yifan Zhang¹, Qijian Zhang¹, Junhui Hou¹, Yixuan Yuan², and Guoliang Xing²; ¹City University of Hong Kong, ²The Chinese University of Hong Kong |
| Pseudocode | No | The paper describes methods with mathematical equations and block diagrams (e.g., Figure 3), but it does not include formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | The source code is available at https://github.com/Eaphan/UPIDet. |
| Open Datasets | Yes | We conducted experiments on the prevailing KITTI benchmark dataset, which contains two modalities of 3D point clouds and 2D RGB images. Following previous works [33], we divided all training data into two subsets, i.e., 3712 samples for training and the rest 3769 for validation. Besides, we also conducted experiments on the Waymo Open Dataset (WOD) [38], which can be found in Appendix A.3. |
| Dataset Splits | Yes | Following previous works [33], we divided all training data into two subsets, i.e., 3712 samples for training and the rest 3769 for validation. |
| Hardware Specification | Yes | In our experiments, the batch size was set to 8, equally distributed on 4 NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper mentions using Adam [13] for optimization and ResNet18 [10] as backbone, but does not specify version numbers for any software dependencies like PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | Through the experiments on KITTI dataset, we adopted Adam [13] (β1=0.9, β2=0.99) to optimize our UPIDet. We initialized the learning rate as 0.003 and updated it with the one-cycle policy [37]. And we trained the model for a total of 80 epochs in an end-to-end manner. In our experiments, the batch size was set to 8, equally distributed on 4 NVIDIA 3090 GPUs. |
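
The numbers quoted in the Dataset Splits, Hardware Specification, and Experiment Setup rows can be cross-checked with a small sketch. The Python below is an illustration only, not the authors' training code: `TrainConfig` and `one_cycle_lr` are hypothetical names, and the warm-up fraction (`pct_start`) and starting-rate divisor (`div_factor`) are assumptions, since the table states only that a one-cycle policy was used. The schedule in the released code may differ in shape.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted in the table above (KITTI experiments).
    train_samples: int = 3712
    val_samples: int = 3769
    batch_size: int = 8
    num_gpus: int = 4
    epochs: int = 80
    max_lr: float = 0.003
    betas: tuple = (0.9, 0.99)  # Adam beta1, beta2
    # ASSUMPTIONS: not given in the source; chosen only for illustration.
    pct_start: float = 0.4
    div_factor: float = 10.0

    @property
    def per_gpu_batch(self) -> int:
        # "batch size 8, equally distributed on 4 GPUs"
        return self.batch_size // self.num_gpus

    @property
    def steps_per_epoch(self) -> int:
        return math.ceil(self.train_samples / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


def one_cycle_lr(step: int, cfg: TrainConfig) -> float:
    """Cosine one-cycle curve: rise to max_lr, then anneal back down."""
    warmup = round(cfg.pct_start * cfg.total_steps)
    low = cfg.max_lr / cfg.div_factor
    if step < warmup:
        t = step / max(1, warmup)
        start, end = low, cfg.max_lr
    else:
        t = (step - warmup) / max(1, cfg.total_steps - warmup)
        start, end = cfg.max_lr, low
    t = min(t, 1.0)
    # Cosine interpolation from `start` (t = 0) to `end` (t = 1).
    return end + (start - end) * (1 + math.cos(math.pi * t)) / 2


cfg = TrainConfig()
print(cfg.train_samples + cfg.val_samples)  # size of the annotated KITTI pool
print(cfg.per_gpu_batch, cfg.steps_per_epoch, cfg.total_steps)
```

Under these assumptions, each GPU processes 2 samples per step, and the 80 epochs correspond to 464 optimizer steps per epoch. The learning rate peaks at the quoted 0.003 regardless of the assumed curve shape.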