Augmented Commonsense Knowledge for Remote Object Grounding

Authors: Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results demonstrate our proposed model noticeably outperforms the baseline and achieves the state-of-the-art on the REVERIE benchmark." |
| Researcher Affiliation | Academia | (1) Australian Institute for Machine Learning (AIML), University of Adelaide; (2) Australian National University; (3) Macquarie University; (4) Griffith University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "The source code is available at https://github.com/Bahram-Mohammadi/ACK." |
| Open Datasets | Yes | "Experimental results demonstrate our proposed model noticeably outperforms the baseline and achieves the state-of-the-art on the REVERIE benchmark." "The experiments are conducted on the REVERIE dataset and results show that our proposed approach, ACK, outperforms the state-of-the-art methods." |
| Dataset Splits | Yes | The Table 1 header reads "Validation Unseen", "Test Unseen", "Navigation", "Grounding", and the paper states: "The ACK is merely ablated on the validation unseen split of REVERIE." |
| Hardware Specification | Yes | "We only fine-tune the proposed model for 20k iterations on a single NVIDIA 3090 GPU." |
| Software Dependencies | No | The paper mentions several software components, including the AdamW optimizer, ViT-B/16, Faster R-CNN, ConceptNet, and the CLIP model, but gives no version numbers for any of them or for underlying frameworks such as PyTorch or TensorFlow. |
| Experiment Setup | Yes | "We use AdamW optimizer (Loshchilov and Hutter 2018) and the learning rate is 10^-5 during the training. The ACK is not incorporated into pre-training tasks of DUET (Chen et al. 2022) and we only fine-tune the proposed model for 20k iterations on a single NVIDIA 3090 GPU." |
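
The fine-tuning settings quoted in the Experiment Setup entry can be collected into a small configuration record. The following is a minimal sketch, not the authors' code: the class name `FineTuneConfig`, its field names, and the helper `unreported_settings` are hypothetical and introduced only for illustration. Values not given in the quoted excerpt (batch size, weight decay, random seed) are left unset rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the fine-tuning settings quoted above."""

    # Values stated in the quoted excerpt.
    optimizer: str = "AdamW"
    learning_rate: float = 1e-5
    max_iterations: int = 20_000
    num_gpus: int = 1
    gpu_model: str = "NVIDIA 3090"

    # Not stated in the excerpt; left as None instead of inventing a value.
    batch_size: Optional[int] = None
    weight_decay: Optional[float] = None
    random_seed: Optional[int] = None


def unreported_settings(config: FineTuneConfig) -> List[str]:
    """Return the names of fields that have no value yet."""
    return [f.name for f in fields(config) if getattr(config, f.name) is None]


config = FineTuneConfig()
print(unreported_settings(config))
```

Listing the unset fields makes explicit which details a reproduction attempt would still need to recover from the paper or the released source code.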