reproducibilityindex.ai

BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection

Authors: Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

ICLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that the proposed method outperforms current KD approaches on a highly-competitive baseline, BEVFormer, without introducing any extra cost in the inference phase. Notably, our best model achieves 59.4 NDS on the nuScenes test leaderboard, achieving new state-of-the-arts in comparison with various image-based detectors.
Researcher Affiliation	Collaboration	Zehui Chen¹, Zhenyu Li², Shiquan Zhang³, Liangji Fang³, Qinhong Jiang³, Feng Zhao¹; ¹ University of Science and Technology of China; ² Harbin Institute of Technology; ³ SenseTime Research; lovesnow@mail.ustc.edu.cn, fzhao956@ustc.edu.cn, zhenyuli17@hit.edu.cn, {zhangshiquan,fangliangji,jiangqinhong}@senseauto.com
Pseudocode	No	The paper does not contain a pseudocode or algorithm block.
Open Source Code	Yes	Code will be available at https://github.com/zehuichen123/BEVDistill.
Open Datasets	Yes	We conduct the experiments on the NuScenes dataset (Caesar et al., 2020), which is one of the most popular datasets for 3D object detection.
Dataset Splits	Yes	It consists of 700 scenes for training, 150 scenes for validation, and 150 scenes for testing.
Hardware Specification	Yes	All models are trained on 8 NVIDIA A100 GPUs.
Software Dependencies	No	The paper states 'Our codebase is built on MMDetection3D (Contributors, 2020) toolkit.' but does not provide specific version numbers for software dependencies such as PyTorch, CUDA, or the MMDetection3D toolkit itself.
Experiment Setup	Yes	During the distillation phase, the batch size is set to 1 per GPU with an initial learning rate of 2e-4. Unless otherwise specified, we train the models for 2× schedule (24 epochs) with a cyclic policy. The input image size is set to 1600 × 900 and the grid size of the BEV plane in BEVFormer is set to 128 × 128.
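
The reported setup can be collected into one small record to make the implied totals explicit. The sketch below is illustrative only: the class name `DistillSetup` and its field names are hypothetical and do not come from the BEVDistill codebase or MMDetection3D. The effective batch size assumes plain data-parallel training with no gradient accumulation, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistillSetup:
    """Hypothetical container for the values quoted in the table above."""
    num_gpus: int = 8                  # "8 NVIDIA A100 GPUs"
    batch_size_per_gpu: int = 1        # "batch size is set to 1 per GPU"
    initial_lr: float = 2e-4           # "initial learning rate of 2e-4"
    epochs: int = 24                   # "2x schedule (24 epochs)"
    image_size: tuple = (1600, 900)    # input image size, as reported
    bev_grid: tuple = (128, 128)       # BEV plane grid size in BEVFormer
    train_scenes: int = 700
    val_scenes: int = 150
    test_scenes: int = 150

    @property
    def effective_batch_size(self) -> int:
        # Assumption: one process per GPU, no gradient accumulation.
        return self.num_gpus * self.batch_size_per_gpu

    @property
    def total_scenes(self) -> int:
        return self.train_scenes + self.val_scenes + self.test_scenes

    @property
    def bev_cells(self) -> int:
        # Number of cells in the BEV grid.
        return self.bev_grid[0] * self.bev_grid[1]


setup = DistillSetup()
print("effective batch size:", setup.effective_batch_size)
print("total scenes:", setup.total_scenes)
print("BEV cells:", setup.bev_cells)
```

Under these assumptions the global batch is 8 samples per step, and the three splits sum to 1,000 scenes. Software versions remain unspecified, as the Software Dependencies row notes.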