Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM
Authors: Ziying Song, Guoxing Zhang, Lin Liu, Lei Yang, Shaoqing Xu, Caiyan Jia, Feiyang Jia, Li Wang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Consequently, RoboFusion achieves SOTA performance in noisy scenarios, as demonstrated by the KITTI-C and nuScenes-C benchmarks. ... We validate RoboFusion's robustness against OOD noise scenarios in KITTI-C and nuScenes-C datasets [Dong et al., 2023], achieving SOTA performance amid noise, as shown in Fig. 1. |
| Researcher Affiliation | Academia | Ziying Song1,2, Guoxing Zhang3, Lin Liu1,2, Lei Yang4, Shaoqing Xu5, Caiyan Jia1,2, Feiyang Jia1,2, Li Wang6; 1School of Computer Science and Technology, Beijing Jiaotong University, China; 2Beijing Key Lab of Traffic Data Analysis and Mining, China; 3Hebei University of Science and Technology, China; 4Tsinghua University, China; 5University of Macau, China; 6Beijing Institute of Technology, China; EMAIL |
| Pseudocode | No | The paper contains figures and descriptions of the framework, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/adept-thu/RoboFusion. |
| Open Datasets | Yes | We perform experiments on both the clean public benchmarks (KITTI [Geiger et al., 2012] and nuScenes [Caesar et al., 2020]) and the noisy public benchmarks (KITTI-C [Dong et al., 2023] and nuScenes-C [Dong et al., 2023]). |
| Dataset Splits | Yes | The KITTI dataset provides synchronized LiDAR point clouds and front-view camera images, consists of 3,712 training samples, 3,769 validation samples and 7,518 test samples. The nuScenes dataset is a large-scale 3D detection benchmark consisting of 700 training scenes, 150 validation scenes, and 150 testing scenes. |
| Hardware Specification | Yes | To enable effective training on the KITTI and nuScenes datasets, we utilize 8 NVIDIA A100 GPUs for network training. Additionally, the runtime is evaluated on an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'OpenPCDet' but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Specifically, for KITTI, our RoboFusion based on Focals Conv [Chen et al., 2022] involves training for 80 epochs. For nuScenes, our RoboFusion based on TransFusion [Bai et al., 2022] has 20 epochs of training. During the model inference stage, we employ a non-maximal suppression (NMS) operation in the Region Proposal Network (RPN) with an IoU threshold of 0.7. We select the top 100 region proposals to serve as inputs for the detection head. After refinement, we apply NMS again with an IoU threshold of 0.1 to eliminate redundant predictions. |
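
The inference-time filtering quoted in the Experiment Setup row (NMS at IoU 0.7, top 100 proposals, refinement, NMS at IoU 0.1) can be illustrated with a minimal sketch. It is not taken from the RoboFusion code: it assumes axis-aligned 2D boxes instead of the rotated 3D boxes a LiDAR detector would use, it uses a plain greedy NMS in pure Python, and the names `iou`, `nms`, `two_stage_filter`, and the `refine` callable (a stand-in for the detection head) are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); axis-aligned simplification
Scored = Tuple[Box, float]               # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its IoU with every already-kept box does not exceed the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def two_stage_filter(
    boxes: List[Box],
    scores: List[float],
    refine: Optional[Callable[[List[Scored]], List[Scored]]] = None,
    rpn_iou: float = 0.7,    # first NMS threshold quoted in the table
    top_k: int = 100,        # number of proposals passed to the detection head
    final_iou: float = 0.1,  # second NMS threshold quoted in the table
) -> List[Scored]:
    # Stage 1: loose NMS on raw proposals, then keep the top_k by score.
    kept = nms(boxes, scores, rpn_iou)[:top_k]
    proposals = [(boxes[i], scores[i]) for i in kept]
    # Hypothetical refinement step; the identity is used when none is given.
    refined = refine(proposals) if refine is not None else proposals
    # Stage 2: strict NMS to drop redundant refined predictions.
    final = nms([b for b, _ in refined], [s for _, s in refined], final_iou)
    return [refined[i] for i in final]


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for box, score in two_stage_filter(demo_boxes, demo_scores):
        print(box, score)
```

In the demo, the first two boxes overlap with an IoU of 81/119 (about 0.68), so both survive the 0.7 threshold but the lower-scoring one is removed by the 0.1 threshold. This shows why the second pass is the one that eliminates near-duplicate predictions.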