Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection
Authors: Linyan Huang, Zhiqi Li, Chonghao Sima, Wenhai Wang, Jingdong Wang, Yu Qiao, Hongyang Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct our experiments on the nuScenes dataset [3], a widely used benchmark for autonomous driving tasks. ... As presented in Tab. 1, the performance of VCD-A surpasses other cutting-edge methods, achieving a record of 44.6% and 56.6% on the nuScenes benchmark. This provides robust evidence of the effectiveness of our approach. ... To verify the effectiveness and necessity of each component, we conduct various ablation experiments on the nuScenes validation set. |
| Researcher Affiliation | Collaboration | 1 Shanghai AI Lab, 2 Nanjing University, 3 CUHK, 4 Baidu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks with step-by-step instructions. Figure 2 is an 'Algorithm Overview' diagram, not pseudocode. |
| Open Source Code | Yes | The code will be released at https://github.com/OpenDriveLab/Birds-eye-view-Perception. |
| Open Datasets | Yes | We conduct our experiments on the nuScenes dataset [3], a widely used benchmark for autonomous driving tasks. [3] Holger Caesar, Varun Bankiti, Alex H Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. In CVPR, 2020. |
| Dataset Splits | Yes | The dataset comprises 700 training scenes, 150 validation scenes, and 150 testing scenes. (See the devkit sketch after this table.) |
| Hardware Specification | Yes | Main experiments are trained on 8 NVIDIA A100 GPUs, while ablation experiments are conducted on 8 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions that 'The codebase is developed upon MMDetection3D [13]' but does not provide specific version numbers for MMDetection3D or any other software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For BEVDepth, the model is trained for 20 epochs with an initial learning rate of 2e-4. In the distillation process, the per-GPU batch size is set to 4, whereas during the training of the baseline model, it is set to 8. (See the config sketch after this table.) |
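The 700/150/150 scene counts quoted in the Dataset Splits row are the canonical nuScenes splits, which ship with the official `nuscenes-devkit` package. The snippet below is a minimal sketch of how to cross-check them; `create_splits_scenes` is the devkit's public helper and needs no downloaded data, since the splits are static scene-name lists.

```python
# Cross-check the 700/150/150 scene split using the official
# nuscenes-devkit (pip install nuscenes-devkit). The splits are
# static name lists, so no dataset download is required.
from nuscenes.utils.splits import create_splits_scenes

splits = create_splits_scenes()
print(len(splits['train']))  # 700 training scenes
print(len(splits['val']))    # 150 validation scenes
print(len(splits['test']))   # 150 testing scenes
```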
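Since the paper states its codebase is developed upon MMDetection3D, the Experiment Setup row can be read back into an mmcv-style Python config. The sketch below is illustrative only: the paper fixes the epoch count (20), the initial learning rate (2e-4), and the per-GPU batch sizes (4 for distillation, 8 for the baseline), while the optimizer type, weight decay, gradient clipping, and worker count are assumptions added here for a runnable-looking example.

```python
# Minimal sketch of the training schedule implied by the paper's
# "Experiment Setup" row, in MMDetection3D / mmcv 1.x config style.
# Only max_epochs, lr, and samples_per_gpu come from the paper; the
# optimizer type, weight decay, grad clipping, and worker count are
# illustrative assumptions.

# Distillation run: per-GPU batch size 4 on 8 GPUs -> effective batch 32.
data = dict(
    samples_per_gpu=4,   # set to 8 when training the baseline (effective 64)
    workers_per_gpu=4,   # assumption; not specified in the paper
)

optimizer = dict(
    type='AdamW',        # assumption; the paper does not name the optimizer
    lr=2e-4,             # initial learning rate quoted in the paper
    weight_decay=1e-2,   # assumption
)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))  # assumption

runner = dict(type='EpochBasedRunner', max_epochs=20)  # 20 epochs for BEVDepth
```

On the 8-GPU machines listed in the Hardware Specification row, these per-GPU values correspond to an effective batch size of 32 during distillation and 64 for the baseline.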