Mono3DVG: 3D Visual Grounding in Monocular Images
Authors: Yang Zhan, Yuan Yuan, Zhitong Xiong
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive benchmarks and some insightful analyses are provided for Mono3DVG. Extensive comparisons and ablation studies show that our method significantly outperforms all baselines. |
| Researcher Affiliation | Academia | 1 School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, China; 2 Technical University of Munich (TUM), Munich, Germany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The dataset and code will be released. |
| Open Datasets | Yes | To facilitate the broad application of 3D visual grounding, we employ both manual annotation and ChatGPT to annotate a large-scale dataset based on KITTI (Geiger, Lenz, and Urtasun 2012) for Mono3DVG. |
| Dataset Splits | Yes | We split our dataset into 29,990, 5,735, and 5,415 expressions for train/val/test sets respectively. |
| Hardware Specification | Yes | We train 60 epochs with a batch size of 10 by AdamW with 10⁻⁴ learning rate and 10⁻⁴ weight decay on one GTX 3090 24-GiB GPU. |
| Software Dependencies | No | The paper mentions AdamW as the optimizer but does not provide version numbers for any software dependencies such as the programming language, libraries, or frameworks. |
| Experiment Setup | Yes | We train 60 epochs with a batch size of 10 by AdamW with 10⁻⁴ learning rate and 10⁻⁴ weight decay on one GTX 3090 24-GiB GPU. The learning rate decays by a factor of 10 after 40 epochs. The dropout ratio is set to 0.1. |
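The experiment-setup row above can be sketched as a small schedule helper. This is a minimal pure-Python sketch of the reported hyperparameters, not the authors' code; the config keys and function name are illustrative.

```python
# Hyperparameters as reported in the paper (see Experiment Setup above).
TRAIN_CONFIG = {
    "epochs": 60,
    "batch_size": 10,
    "optimizer": "AdamW",
    "base_lr": 1e-4,
    "weight_decay": 1e-4,
    "lr_decay_epoch": 40,    # lr decays by a factor of 10 after 40 epochs
    "lr_decay_factor": 0.1,
    "dropout": 0.1,
}

def lr_at_epoch(epoch: int, cfg: dict = TRAIN_CONFIG) -> float:
    """Step schedule: base_lr until the decay epoch, then base_lr * factor."""
    if epoch >= cfg["lr_decay_epoch"]:
        return cfg["base_lr"] * cfg["lr_decay_factor"]
    return cfg["base_lr"]

print(lr_at_epoch(0))   # 0.0001
```

In a framework such as PyTorch, the same schedule would typically be expressed with `torch.optim.AdamW` plus a `MultiStepLR` scheduler with `milestones=[40]` and `gamma=0.1`.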