CRAFT: Camera-Radar 3D Object Detection with Spatio-Contextual Fusion Transformer
Authors: Youngseok Kim, Sanmin Kim, Jun Won Choi, Dongsuk Kum
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our camera-radar fusion approach achieves the state-of-the-art 41.1% mAP and 52.3% NDS on the nuScenes test set, which is 8.7 and 10.8 points higher than the camera-only baseline, as well as yielding competitive performance on the LiDAR method. ... 4 Experiments ... 4.1 Comparison with State-of-the-Arts ... 4.2 Ablation Studies ... 4.3 Analysis |
| Researcher Affiliation | Academia | Youngseok Kim¹, Sanmin Kim¹, Jun Won Choi², Dongsuk Kum¹ (¹Korea Advanced Institute of Science and Technology, ²Hanyang University); {youngseok.kim, sanmin.kim, dskum}@kaist.ac.kr, junwchoi@hanyang.ac.kr |
| Pseudocode | No | The paper describes algorithms and methods in text and figures but does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | We evaluate our method on a large-scale and challenging nuScenes dataset (Caesar et al. 2020). |
| Dataset Splits | Yes | which consists of 700/150/150 scenes for train/val/test set. (This is the official nuScenes scene split; see the verification sketch below the table.) |
| Hardware Specification | Yes | We train our models for 24 epochs with a batch size of 32, cosine annealing scheduler, and 2 × 10⁻⁴ learning rate on 4 RTX 3090 GPUs. Inference time is measured on an Intel Core i9 CPU and an RTX 3090 GPU without test time augmentation for fusion. |
| Software Dependencies | No | The paper mentions using CenterNet and DLA34 backbone, but it does not specify version numbers for these or other software libraries/frameworks. |
| Experiment Setup | Yes | We train our models for 24 epochs with a batch size of 32, cosine annealing scheduler, and 2 × 10⁻⁴ learning rate on 4 RTX 3090 GPUs. (See the training-schedule sketch below the table.) |
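The 700/150/150 figure quoted for the dataset splits is the official nuScenes scene split. As a minimal check, assuming the `nuscenes-devkit` package is installed, the split sizes can be read directly from the devkit:

```python
# Verify the official nuScenes scene split (700/150/150) via the devkit.
# Assumes: pip install nuscenes-devkit
from nuscenes.utils.splits import create_splits_scenes

splits = create_splits_scenes()  # maps split name -> list of scene names
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))
# Expected output: 700 150 150
```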
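The quoted experiment setup maps onto a standard PyTorch training loop. The sketch below encodes only what the paper states (24 epochs, cosine annealing, initial learning rate 2 × 10⁻⁴); the optimizer choice (AdamW) and the placeholder model are assumptions, since the paper names no optimizer and releases no code:

```python
# Sketch of the quoted schedule: 24 epochs, cosine annealing, lr 2e-4.
# AdamW and the toy model are assumptions (not stated in the paper).
import torch
from torch import nn

model = nn.Linear(16, 16)  # placeholder standing in for the CRAFT detector

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=24)

for epoch in range(24):
    # ... one pass over the nuScenes train split with batch size 32 ...
    optimizer.step()   # stands in for the per-batch parameter updates
    scheduler.step()   # anneal the learning rate once per epoch
```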