Text-Guided Graph Neural Networks for Referring 3D Instance Segmentation
Authors: Pin-Hao Huang, Han-Hung Lee, Hwann-Tzong Chen, Tyng-Luh Liu
AAAI 2021, pp. 1610-1618
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method achieves state-of-the-art performance on referring 3D instance segmentation and 3D localization on the ScanRefer, Nr3D, and Sr3D benchmarks. |
| Researcher Affiliation | Collaboration | 1 Institute of Information Science, Academia Sinica, Taiwan 2 Department of Computer Science, National Tsing Hua University, Taiwan 3 Taiwan AI Labs 4 Aeolus Robotics |
| Pseudocode | Yes | Algorithm 1 Sequential Re-sampling for Instance Masks |
| Open Source Code | No | The paper does not explicitly state that source code for the proposed method is publicly available. |
| Open Datasets | Yes | We evaluate our method using recent 3D referring datasets including ScanRefer (Chen, Chang, and Nießner 2020) and Nr3D/Sr3D of ReferIt3D (Achlioptas et al. 2020). The datasets are based on ScanNetv2 (Dai et al. 2017), which contains 1,513 richly-annotated 3D reconstructions of indoor scenes. |
| Dataset Splits | Yes | These datasets all follow the official ScanNet splits. |
| Hardware Specification | No | The paper mentions 'pre-train a sparse 3D UNet feature extractor' but does not specify any hardware details like CPU, GPU models, or memory. |
| Software Dependencies | No | The paper mentions various models and networks (GloVe, GRU, BERT, MLP, Sparse 3D UNet) but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | For the experiments using GRU as the language extractor, we use a batch size of 8 and an initial learning rate of 0.001 with decay of 0.1 every 100 epochs. The maximum timestep and sentence length for GRU are set to 80. For the experiments with BERT (Vaswani et al. 2017; Devlin et al. 2018), the weights of the BERT model and TGNN are updated separately. The initial learning rate is 0.0002 for BERT with decay of 0.5 every 10 epochs, while the initial learning rate is 0.001 for TGNN with decay of 0.5 every 50 epochs. The batch size is 16, and the maximum sentence length is 80 as in GRU. The number of nearest-neighbors is 16 unless specified. The number of layers in the GNN is set to 3. |