Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection
Authors: Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that the proposed method outperforms current KD approaches on a highly-competitive baseline, BEVFormer, without introducing any extra cost in the inference phase. Notably, our best model achieves 59.4 NDS on the nuScenes test leaderboard, achieving new state-of-the-arts in comparison with various image-based detectors. |
| Researcher Affiliation | Collaboration | Zehui Chen¹, Zhenyu Li², Shiquan Zhang³, Liangji Fang³, Qinhong Jiang³, Feng Zhao¹; ¹ University of Science and Technology of China, ² Harbin Institute of Technology, ³ SenseTime Research |
| Pseudocode | No | The paper does not contain a pseudocode or algorithm block. |
| Open Source Code | Yes | Code will be available at https://github.com/zehuichen123/BEVDistill. |
| Open Datasets | Yes | We conduct the experiments on the nuScenes dataset (Caesar et al., 2020), which is one of the most popular datasets for 3D object detection. |
| Dataset Splits | Yes | It consists of 700 scenes for training, 150 scenes for validation, and 150 scenes for testing. |
| Hardware Specification | Yes | All models are trained on 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper states 'Our codebase is built on MMDetection3D (Contributors, 2020) toolkit.' but does not provide specific version numbers for software dependencies such as PyTorch, CUDA, or the MMDetection3D toolkit itself. |
| Experiment Setup | Yes | During the distillation phase, the batch size is set to 1 per GPU with an initial learning rate of 2e-4. Unless otherwise specified, we train the models for 2× schedule (24 epochs) with a cyclic policy. The input image size is set to 1600 × 900 and the grid size of the BEV plane in BEVFormer is set to 128 × 128. |
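
The values quoted in the Dataset Splits, Hardware Specification, and Experiment Setup rows can be collected into a small record for anyone attempting a reproduction. The sketch below is illustrative only: the class name `DistillSetup`, its field names, and the dictionary `NUSCENES_SCENES` are hypothetical and do not come from the paper or from MMDetection3D. The derived effective batch size assumes that all 8 GPUs are used data-parallel during distillation with no gradient accumulation, which the quoted text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistillSetup:
    """Hypothetical container for the hyperparameters quoted in the table."""

    num_gpus: int = 8                # "trained on 8 NVIDIA A100 GPUs"
    batch_size_per_gpu: int = 1      # "batch size is set to 1 per GPU"
    initial_lr: float = 2e-4         # "initial learning rate of 2e-4"
    epochs: int = 24                 # "2x schedule (24 epochs)"
    lr_policy: str = "cyclic"        # "with a cyclic policy"
    image_size: tuple = (1600, 900)  # input image size, width x height
    bev_grid: tuple = (128, 128)     # BEV plane grid size in BEVFormer

    @property
    def effective_batch_size(self) -> int:
        # Assumption: plain data parallelism, no gradient accumulation.
        return self.num_gpus * self.batch_size_per_gpu


# Scene counts per split as quoted in the Dataset Splits row.
NUSCENES_SCENES = {"train": 700, "val": 150, "test": 150}

setup = DistillSetup()
total_scenes = sum(NUSCENES_SCENES.values())
bev_cells = setup.bev_grid[0] * setup.bev_grid[1]

print(f"effective batch size: {setup.effective_batch_size}")
print(f"total nuScenes scenes: {total_scenes}")
print(f"BEV grid cells: {bev_cells}")
```

Because the Software Dependencies row notes that no version numbers are given, a record like this captures only the hyperparameters; the PyTorch, CUDA, and MMDetection3D versions would still have to be taken from the released code.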