Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention

Authors: Haomeng Zhang, Chiao-An Yang, Raymond A. Yeh

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, experiments show that our method outperforms the state-of-the-art methods on multi-object 3D grounding by 12.8% (absolute) and is competitive in single-object 3D grounding.
Researcher Affiliation | Academia | Department of Computer Science, Purdue University
Pseudocode | No | The paper includes architectural diagrams (Figure 1 and Figure 2) but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/haomengz/D-LISA
Open Datasets | Yes | We conduct experiments on the Multi3DRefer [52] dataset. We also compare our model with other two-stage methods on single-object grounding using the ScanRefer [8] and the Nr3D [2] datasets.
Dataset Splits | Yes | We follow the same train/val set split as the baselines [52].
Hardware Specification | Yes | We train our model on a single NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions software components like 'AdamW optimizer' and 'CLIP with ViT-B/32' but does not provide specific version numbers for these or other key software dependencies (e.g., Python, PyTorch/TensorFlow versions).
Experiment Setup | Yes | We set the batch size to 4 with the AdamW optimizer using a learning rate of 5e-4. We set the dynamic proposal loss coefficient α_dyn to 5. We set the τ_train to 0.25 and search for the optimal value of τ_pred over {0.05, 0.1, 0.15, 0.2, 0.25} during evaluation for M3DRef-CLIP w/NMS and our model.
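
The Experiment Setup entry can be read as a small configuration plus a threshold sweep. The following is a minimal Python sketch of that reading, not the authors' code: the class name, field names, and the `evaluate` callback are hypothetical, the learning rate is taken to be 5e-4, and the sketch assumes that a higher validation score is better when choosing τ_pred.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup entry; field names are hypothetical."""
    batch_size: int = 4
    optimizer: str = "AdamW"
    learning_rate: float = 5e-4   # assumed reading of the reported learning rate
    alpha_dyn: float = 5.0        # dynamic proposal loss coefficient
    tau_train: float = 0.25       # threshold used during training
    tau_pred_grid: Tuple[float, ...] = (0.05, 0.1, 0.15, 0.2, 0.25)


def select_tau_pred(
    evaluate: Callable[[float], float],
    grid: Sequence[float],
) -> Tuple[float, float]:
    """Return (best_tau, best_score) over the grid, assuming higher is better.

    `evaluate` is a hypothetical callback that runs evaluation at a given
    prediction threshold and returns a single scalar metric.
    """
    scored = [(evaluate(tau), tau) for tau in grid]
    # max() keeps the first entry on ties, so the smallest tau wins a tie.
    best_score, best_tau = max(scored, key=lambda pair: pair[0])
    return best_tau, best_score


if __name__ == "__main__":
    setup = ReportedSetup()

    # Stand-in metric for illustration only; it is not a result from the paper.
    def toy_metric(tau: float) -> float:
        return 1.0 - abs(tau - 0.15)

    print(select_tau_pred(toy_metric, setup.tau_pred_grid))
```

In practice `evaluate` would wrap a full validation pass of the model at the given threshold; the released repository may organize this differently.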