Multimodal Virtual Point 3D Detection
Authors: Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the large-scale nuScenes dataset show that our framework improves a strong CenterPoint baseline by a significant 6.6 mAP, and outperforms competing fusion approaches. |
| Researcher Affiliation | Academia | Tianwei Yin UT Austin yintianwei@utexas.edu Xingyi Zhou UT Austin zhouxy@cs.utexas.edu Philipp Krähenbühl UT Austin philkr@cs.utexas.edu |
| Pseudocode | Yes | Algorithm 1: Multi-modal Virtual Point Generation (a hedged sketch follows the table) |
| Open Source Code | Yes | Code and more visualizations are available at https://tianweiy.github.io/mvp/. |
| Open Datasets | Yes | We test our model on the large-scale nuScenes dataset [2]. |
| Dataset Splits | Yes | We follow the official dataset split to use 700, 150, 150 sequences for training, validation, and testing. |
| Hardware Specification | Yes | The training takes 2.5 days on 4 V100 GPUs with a batch size of 16 (4 frames per GPU). |
| Software Dependencies | No | The paper mentions software used ('CenterPoint', 'CenterNet2') but does not specify version numbers for these or underlying software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train the detector on the nuScenes dataset using the SGD optimizer with a batch size of 16 and a learning rate of 0.02 for 90000 iterations. ... We train the model for 20 epochs with the AdamW [34] optimizer using the one-cycle policy [16], with a max learning rate of 3e-3 following [66]. (Both schedules are sketched after the table.) |
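
The paper's Algorithm 1 (Multi-modal Virtual Point Generation) is only named, not reproduced, above. Below is a minimal NumPy sketch of the idea as the paper describes it: sample 2D pixels inside each instance mask from the 2D detector, borrow depth from the nearest projected lidar point in that mask, and unproject to 3D. The function name, its arguments, and the nearest-neighbor depth assignment are illustrative reconstructions, not the authors' exact implementation.

```python
import numpy as np

def generate_virtual_points(lidar_uvz, masks, K_inv, n_samples=50):
    """Sketch of MVP-style virtual point generation.

    lidar_uvz : (N, 3) lidar points projected into the image; columns (u, v, depth),
                assumed to already lie within the image bounds
    masks     : list of (H, W) boolean instance masks from a 2D detector
    K_inv     : (3, 3) inverse camera intrinsics
    n_samples : number of virtual points to sample per instance (assumed value)
    """
    virtual = []
    for mask in masks:
        # Keep lidar points whose 2D projection falls inside this instance mask.
        uv = lidar_uvz[:, :2].astype(int)
        inside = mask[uv[:, 1], uv[:, 0]]
        seeds = lidar_uvz[inside]
        if len(seeds) == 0:
            continue  # no depth evidence for this instance
        # Randomly sample 2D pixels within the mask.
        ys, xs = np.nonzero(mask)
        idx = np.random.choice(len(xs), size=min(n_samples, len(xs)), replace=False)
        samples = np.stack([xs[idx], ys[idx]], axis=1).astype(float)
        # Assign each sample the depth of its nearest projected lidar point (2D distance).
        d2 = ((samples[:, None, :] - seeds[None, :, :2]) ** 2).sum(-1)
        depth = seeds[d2.argmin(axis=1), 2]
        # Unproject the sampled pixels with the borrowed depth into 3D camera coordinates.
        homog = np.concatenate([samples, np.ones((len(samples), 1))], axis=1)
        virtual.append((K_inv @ homog.T).T * depth[:, None])
    return np.concatenate(virtual, axis=0) if virtual else np.empty((0, 3))
```

In the paper, the resulting virtual points are additionally transformed back into the lidar frame and augmented with semantic features before being fed to the 3D detector; those steps are omitted here for brevity.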
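
The two training schedules quoted in the Experiment Setup row map directly onto standard PyTorch optimizer calls. Below is a minimal sketch under the reported hyperparameters; the stand-in model, SGD momentum, weight decay, and `steps_per_epoch` are placeholder assumptions, not values from the paper.

```python
import torch

# Placeholder module standing in for the actual detectors.
model = torch.nn.Linear(10, 10)

# 2D detector schedule as reported: SGD, lr 0.02, batch size 16, 90000 iterations.
# Momentum 0.9 is an assumed default, not stated in the paper.
optimizer_2d = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.9)

# 3D detector schedule as reported: AdamW + one-cycle policy, max lr 3e-3, 20 epochs.
# weight_decay and steps_per_epoch are assumptions.
optimizer_3d = torch.optim.AdamW(model.parameters(), lr=3e-3, weight_decay=0.01)
steps_per_epoch = 1000  # depends on dataset size at batch size 16
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer_3d, max_lr=3e-3, epochs=20, steps_per_epoch=steps_per_epoch,
)

# One-cycle schedules step per iteration, not per epoch:
for _ in range(steps_per_epoch):
    optimizer_3d.step()
    scheduler.step()
```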