M-BEV: Masked BEV Perception for Robust Autonomous Driving
Authors: Siran Chen, Yue Ma, Yu Qiao, Yali Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on the popular nuScenes benchmark, where our framework can significantly boost 3D perception performance of the state-of-the-art models on various missing view cases, e.g., for the absence of back view, our M-BEV promotes the PETRv2 model with 10.3% mAP gain. |
| Researcher Affiliation | Academia | 1 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; 3 Shanghai Artificial Intelligence Laboratory, Shanghai, China; 4 Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions following 'official implementation on open-sourced code bases' for baseline models (PETRv2, BEVStereo) but does not state that the code for their proposed M-BEV framework is open-source or provide a link. |
| Open Datasets | Yes | We conduct our experiments on the popular nuScenes dataset (Caesar et al. 2020). nuScenes is a large-scale benchmark for autonomous driving, where the data is collected from 1000 real driving scenes with around 20 seconds duration. The scenes are divided: 700 of them for training, and 150 each for validation and testing. |
| Dataset Splits | Yes | The scenes are divided: 700 of them for training, and 150 each for validation and testing. (The devkit sketch after this table verifies these counts.) |
| Hardware Specification | Yes | We use 8 A5000 GPUs for all experiments. |
| Software Dependencies | No | The paper mentions using 'open-sourced code bases' and specific models (PETRv2, BEVStereo) but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The MVR module is fine-tuned for 48 epochs, and the learning rate is set to 2.0×10⁻⁴. The decoder has four transformer layers, and the hidden dimension is 512. [...] we set a weight coefficient α = 0.05 for the reconstruction loss in the fine-tuning. (A hedged configuration sketch follows this table.) |
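The 700/150/150 scene split quoted in the Dataset Splits row is the standard nuScenes split, and it can be checked directly against the official nuscenes-devkit split definitions. A minimal sketch, assuming the devkit is installed (`pip install nuscenes-devkit`):

```python
# Verify the standard nuScenes scene split (700 train / 150 val / 150 test)
# using the official devkit's split definitions.
from nuscenes.utils.splits import create_splits_scenes

splits = create_splits_scenes()  # dict: split name -> list of scene names
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))
# Expected output: 700 150 150
```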
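The Experiment Setup row reports enough hyperparameters to reconstruct the shape of the fine-tuning objective. The sketch below is a minimal PyTorch-style illustration, not the authors' code: the `det_loss` / `recon_loss` names are hypothetical stand-ins for the detection and MVR reconstruction losses, and the AdamW optimizer is an assumption (the paper reports only the learning rate).

```python
import torch

# Hyperparameters as reported in the paper's experiment setup.
EPOCHS = 48              # MVR fine-tuning epochs
LR = 2.0e-4              # fine-tuning learning rate (2.0 x 10^-4)
NUM_DECODER_LAYERS = 4   # transformer layers in the decoder
HIDDEN_DIM = 512         # decoder hidden dimension
ALPHA = 0.05             # weight coefficient for the reconstruction loss

def mbev_finetune_loss(det_loss: torch.Tensor,
                       recon_loss: torch.Tensor) -> torch.Tensor:
    """Combined fine-tuning objective: detection loss plus weighted
    MVR reconstruction loss, i.e. L_total = L_det + alpha * L_recon,
    with alpha = 0.05 as stated in the paper."""
    return det_loss + ALPHA * recon_loss

# Optimizer choice is an assumption; the paper reports only the learning rate:
# optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
```

Keeping α small (0.05) matches the paper's framing: reconstruction acts as an auxiliary signal during fine-tuning rather than competing with the primary detection objective.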