Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

OpenBox: Annotate Any Bounding Boxes in 3D

Authors: In-Jae Lee, Mungyeom Kim, Kwonyoung Ryu, Pierre Musacchio, Jaesik Park

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on the Waymo Open Dataset (WOD), the Lyft Level 5 Perception dataset, and the nuScenes dataset demonstrate improved accuracy and efficiency over baselines. Our project page is available at: https://oliver0922.github.io/OpenBox/.
Researcher Affiliation	Academia	1Seoul National University 2POSTECH
Pseudocode	No	The paper describes the pipeline overview in Figure 2 and detailed steps in Section 3, but does not provide any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Our code is based on OpenPCDet [35] and MMDetection3D [5]. Additional details on training, hyperparameters, and network architecture are provided in Appendix B. ... Code will be released.
Open Datasets	Yes	We conduct experiments on Waymo Open Dataset (WOD) [33], Lyft Level 5 Perception Dataset (Lyft) [12], and nuScenes [2].
Dataset Splits	Yes	We conduct experiments on Waymo Open Dataset (WOD) [33] validation set, Lyft Level 5 Perception Dataset (Lyft) [12] and nuScenes [2]. ... Table 1: 3D object-detection results on the WOD [33] validation set. ... Table 2: 3D object-detection results on the Lyft [12] validation set. ... Table 3: Annotation performance on Lyft [12] training dataset. ... Table 4: 3D object-detection results on the nuScenes [2] validation set.
Hardware Specification	Yes	We train models [6, 32, 42] on 8 NVIDIA A6000 GPUs (48GB) and 2 AMD EPYC 7763 CPUs.
Software Dependencies	No	Our code is based on OpenPCDet [35] and MMDetection3D [5]. ... We also employ VDBFusion [36] for SDF.
Experiment Setup	Yes	Table 6: Training and network details for experiment configs. Columns: Voxel R-CNN [6], PointRCNN [32], CenterPoint [42]. optimizer: AdamW, AdamW, AdamW; base learning rate: 1e-2, 1e-2, 1e-4; weight decay: 1e-3, 1e-2, 1e-2; momentum: 0.9, 1e-2; momentum range: [0.95, 0.85], [0.95, 0.85]; learning rate decay: 0.1, 0.1; learning rate clip: 1e-7, 1e-7; gradient norm clip: 10, 10, 35; batch size: 16, 2, 32; epoch: 20, 60, 20
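
The fully specified rows of Table 6 can be written down as a small configuration sketch. This is only an illustration, not the authors' configuration files: the `TrainConfig` class and its field names are hypothetical, and the mapping of values to models assumes they follow the column order Voxel R-CNN, PointRCNN, CenterPoint. Rows that list fewer than three values (momentum, momentum range, learning rate decay, learning rate clip) are left out because their column assignment cannot be recovered from the flattened text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the Table 6 rows that list three values."""
    optimizer: str
    base_lr: float
    weight_decay: float
    grad_norm_clip: float
    batch_size: int
    epochs: int


# Assumption: values are assigned to models in the column order given in
# Table 6 (Voxel R-CNN, PointRCNN, CenterPoint).
CONFIGS = {
    "Voxel R-CNN": TrainConfig("AdamW", 1e-2, 1e-3, 10, 16, 20),
    "PointRCNN": TrainConfig("AdamW", 1e-2, 1e-2, 10, 2, 60),
    "CenterPoint": TrainConfig("AdamW", 1e-4, 1e-2, 35, 32, 20),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(f"{name}: lr={cfg.base_lr}, wd={cfg.weight_decay}, "
              f"clip={cfg.grad_norm_clip}, bs={cfg.batch_size}, "
              f"epochs={cfg.epochs}")
```

Keeping the partially specified rows out of the sketch avoids guessing which model each value belongs to; they would need to be checked against Table 6 in the paper itself.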