Augmented Commonsense Knowledge for Remote Object Grounding
Authors: Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate our proposed model noticeably outperforms the baseline and achieves the state-of-the-art on the REVERIE benchmark. |
| Researcher Affiliation | Academia | (1) Australian Institute for Machine Learning (AIML), University of Adelaide; (2) Australian National University; (3) Macquarie University; (4) Griffith University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/BahramMohammadi/ACK. |
| Open Datasets | Yes | Experimental results demonstrate our proposed model noticeably outperforms the baseline and achieves the state-of-the-art on the REVERIE benchmark. The experiments are conducted on the REVERIE dataset and results show that our proposed approach, ACK, outperforms the state-of-the-art methods. |
| Dataset Splits | Yes | The Table 1 header lists the "Validation Unseen" and "Test Unseen" splits, each with Navigation and Grounding metrics, and the paper states: The ACK is merely ablated on the validation unseen split of REVERIE. |
| Hardware Specification | Yes | we only fine-tune the proposed model for 20k iterations on a single NVIDIA 3090 GPU. |
| Software Dependencies | No | The paper mentions several software components like AdamW optimizer, ViT-B/16, Faster R-CNN, ConceptNet, and CLIP model, but does not provide specific version numbers for any of them or for underlying frameworks like PyTorch or TensorFlow. |
| Experiment Setup | Yes | We use AdamW optimizer (Loshchilov and Hutter 2018) and the learning rate is 10⁻⁵ during the training. The ACK is not incorporated into pre-training tasks of DUET (Chen et al. 2022) and we only fine-tune the proposed model for 20k iterations on a single NVIDIA 3090 GPU. |
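
The reported fine-tuning settings can be collected into a small configuration record, which also makes it easy to see what the table leaves unstated. This is a minimal sketch, not the authors' code: the class `FineTuneConfig`, the helper `unreported`, and all field names are hypothetical, and the `batch_size` and `weight_decay` fields are included only as examples of values the table does not report.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical record of the fine-tuning settings quoted in the table."""

    optimizer: str = "AdamW"              # reported (Loshchilov and Hutter 2018)
    learning_rate: float = 1e-5           # reported
    max_iterations: int = 20_000          # reported: 20k fine-tuning iterations
    num_gpus: int = 1                     # reported: a single NVIDIA 3090 GPU
    dataset: str = "REVERIE"              # reported benchmark
    batch_size: Optional[int] = None      # not stated in the table; left unset
    weight_decay: Optional[float] = None  # not stated in the table; left unset


def unreported(cfg: FineTuneConfig) -> List[str]:
    """Return the names of fields that have no reported value (still None)."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) is None]


config = FineTuneConfig()
missing = unreported(config)
```

Here `missing` names the settings a reproduction would have to take from the released source code or from the DUET baseline rather than from the quoted text.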