BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework

Authors: Tingting Liang, Hongwei Xie, Kaicheng Yu, Zhongyu Xia, Zhiwei Lin, Yongtao Wang, Tao Tang, Bing Wang, Zhi Tang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically show that our framework surpasses the state-of-the-art methods under the normal training settings. Under the robustness training settings that simulate various LiDAR malfunctions, our framework significantly surpasses the state-of-the-art methods by 15.7% to 28.9% mAP."
Researcher Affiliation | Collaboration | "1 Wangxuan Institute of Computer Technology, Peking University, China; 2 DAMO Academy, Alibaba Group, China; 3 Shenzhen Campus of Sun Yat-sen University, China"
Pseudocode | No | The paper does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | "The code is available at https://github.com/ADLab-AutoDrive/BEVFusion."
Open Datasets | Yes | "We conduct comprehensive experiments on a large-scale autonomous-driving dataset for 3D detection, nuScenes [2]."
Dataset Splits | Yes | "We conduct comprehensive experiments on a large-scale autonomous-driving dataset for 3D detection, nuScenes [2]. ... On the nuScenes dataset, our simple framework shows great generalization ability. Following the same training settings [20, 59, 1], BEVFusion improves PointPillars and CenterPoint by 18.4% and 7.1% in mean average precision (mAP) respectively, and achieves a superior performance of 69.2% mAP comparing to 68.9% mAP of TransFusion [1], which is considered as state-of-the-art." (A hedged sketch of loading the official nuScenes splits appears after this table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | "We implement our network in PyTorch using the open-sourced MMDetection3D [8]." The paper mentions software components but does not provide specific version numbers for them. (A snippet for recording the installed versions appears after this table.)
Experiment Setup | Yes | "We set the image size to 448×800 and the voxel size following the official settings of the LiDAR stream [20, 59, 1]. Our training consists of two stages: i) We first train the LiDAR stream and camera stream with multi-view image input and LiDAR point clouds input, respectively. Specifically, we train both streams following their LiDAR official settings in MMDetection3D [8]; ii) We then train BEVFusion for another 9 epochs that inherit weights from two trained streams." (An illustrative sketch of this two-stage schedule appears after this table.)
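
For readers checking how the nuScenes train/val division is defined, the official scene-level splits come with the nuscenes-devkit rather than the paper itself. The sketch below is a minimal illustration, assuming a local copy of the dataset at a placeholder path; BEVFusion consumes these splits indirectly through MMDetection3D's nuScenes converter.

```python
# Minimal sketch of loading the official nuScenes train/val splits with the
# nuscenes-devkit. The dataroot path is a placeholder, not taken from the paper.
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.splits import create_splits_scenes

nusc = NuScenes(version='v1.0-trainval', dataroot='/data/nuscenes', verbose=True)
splits = create_splits_scenes()  # dict with 'train', 'val', 'test' scene name lists

train_scenes = [s for s in nusc.scene if s['name'] in splits['train']]
val_scenes = [s for s in nusc.scene if s['name'] in splits['val']]
print(len(train_scenes), len(val_scenes))  # 700 train / 150 val scenes in v1.0-trainval
```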
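
Because the paper names PyTorch and MMDetection3D without version numbers, anyone attempting reproduction has to pin an environment themselves. The snippet below is not from the paper; it simply records whatever versions the local environment provides so they can be documented alongside results.

```python
# Record the installed versions of the software stack named in the paper.
# The paper does not specify versions, so these reflect the local environment only.
import torch
import mmcv
import mmdet
import mmdet3d

for name, mod in [('torch', torch), ('mmcv', mmcv), ('mmdet', mmdet), ('mmdet3d', mmdet3d)]:
    print(f'{name}=={mod.__version__}')
print('CUDA available:', torch.cuda.is_available())
```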
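
The quoted two-stage schedule (train each stream separately, then fine-tune the fused model for 9 epochs from the two sets of weights) can be sketched in plain PyTorch. Everything below is illustrative: the module names (BEVFusionModel, NuScenesFusionDataset), checkpoint paths, and optimizer settings are assumptions, not the authors' released configuration, which lives in their MMDetection3D-based repository.

```python
# Hypothetical sketch of BEVFusion's stage-ii training: inherit weights from the
# independently trained LiDAR and camera streams, then train the fused model for
# 9 epochs. Module names, paths, and hyperparameters are placeholders.
import torch
from torch.utils.data import DataLoader

from my_models import BEVFusionModel           # assumed: fusion model wrapping both streams
from my_datasets import NuScenesFusionDataset  # assumed: yields (images, points, targets)

model = BEVFusionModel(img_size=(448, 800))    # image size quoted from the paper

# Stage ii inherits weights from the two stage-i checkpoints (paths are placeholders).
lidar_ckpt = torch.load('work_dirs/lidar_stream/latest.pth', map_location='cpu')
camera_ckpt = torch.load('work_dirs/camera_stream/latest.pth', map_location='cpu')
model.lidar_stream.load_state_dict(lidar_ckpt['state_dict'], strict=False)
model.camera_stream.load_state_dict(camera_ckpt['state_dict'], strict=False)

loader = DataLoader(NuScenesFusionDataset(split='train'), batch_size=4, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)  # assumed values

model.cuda().train()
for epoch in range(9):  # "another 9 epochs" per the quoted setup
    for images, points, targets in loader:
        losses = model(images.cuda(), [p.cuda() for p in points], targets)
        loss = sum(losses.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```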