Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention

Authors: Haomeng Zhang, Chiao-An Yang, Raymond A. Yeh

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, experiments show that our method outperforms the state-of-the-art methods on multi-object 3D grounding by 12.8% (absolute) and is competitive in single-object 3D grounding." |
| Researcher Affiliation | Academia | Department of Computer Science, Purdue University |
| Pseudocode | No | The paper includes architectural diagrams (Figures 1 and 2) but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/haomengz/D-LISA |
| Open Datasets | Yes | "We conduct experiments on the Multi3DRefer [52] dataset. We also compare our model with other two-stage methods on single-object grounding using the ScanRefer [8] and the Nr3D [2] datasets." |
| Dataset Splits | Yes | "We follow the same train/val set split as the baselines [52]." |
| Hardware Specification | Yes | "We train our model on a single NVIDIA A100 GPU." |
| Software Dependencies | No | The paper mentions software components such as the AdamW optimizer and CLIP with ViT-B/32 but does not provide version numbers for these or for other key dependencies (e.g., Python, PyTorch/TensorFlow). |
| Experiment Setup | Yes | "We set the batch size to 4 with the AdamW optimizer using a learning rate of 5e-4. We set the dynamic proposal loss coefficient α_dyn to 5. We set τ_train to 0.25 and search for the optimal value of τ_pred over {0.05, 0.1, 0.15, 0.2, 0.25} during evaluation for M3DRef-CLIP w/ NMS and our model." |
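The experiment-setup row above can be summarized as a minimal configuration sketch. The dictionary keys, the `select_tau_pred` helper, and the `evaluate` callable are illustrative assumptions for clarity; they are not the authors' actual code, which uses the reported values inside a full training pipeline.

```python
# Hyperparameters reported in the paper's experiment setup (values from the table above).
config = {
    "batch_size": 4,
    "optimizer": "AdamW",
    "learning_rate": 5e-4,
    "alpha_dyn": 5,      # dynamic proposal loss coefficient
    "tau_train": 0.25,   # confidence threshold fixed during training
}

# Candidate values searched for tau_pred during evaluation.
TAU_PRED_CANDIDATES = [0.05, 0.1, 0.15, 0.2, 0.25]


def select_tau_pred(evaluate, candidates=TAU_PRED_CANDIDATES):
    """Return the threshold with the best validation score.

    `evaluate` stands in for the model's validation routine
    (hypothetical interface; the paper does not specify one).
    """
    return max(candidates, key=evaluate)


# Usage with a dummy scoring function that peaks at 0.1:
best = select_tau_pred(lambda t: -abs(t - 0.1))  # → 0.1
```

The grid search mirrors the paper's stated procedure: each candidate threshold is scored on the validation split and the best-scoring value is kept for reporting.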