GAMap: Zero-Shot Object Goal Navigation with Multi-Scale Geometric-Affordance Guidance
Authors: Shuaihang Yuan, Hao Huang, Yu Hao, Congcong Wen, Anthony Tzes, Yi Fang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments conducted on HM3D and Gibson benchmark datasets demonstrate improvements in Success Rate and Success weighted by Path Length, underscoring the efficacy of our geometric-part and affordance-guided navigation approach in enhancing robot autonomy and versatility, without any additional object-specific training or fine-tuning with the semantics of unseen objects and/or the locomotions of the robot. |
| Researcher Affiliation | Academia | Shuaihang Yuan 1,2,4, Hao Huang 2,4, Yu Hao 2,3,4, Congcong Wen 2,4, Anthony Tzes 1,2, Yi Fang 1,2,3,4. 1 NYUAD Center for Artificial Intelligence and Robotics (CAIR), Abu Dhabi, UAE. 2 New York University Abu Dhabi, Electrical Engineering, Abu Dhabi 129188, UAE. 3 New York University, Electrical & Computer Engineering Dept., Brooklyn, NY 11201, USA. 4 Embodied AI and Robotics (AIR) Lab, NYU Abu Dhabi, UAE. |
| Pseudocode | No | The paper includes pipeline diagrams but no explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our project is available at https://shalexyuan.github.io/GAMap/. |
| Open Datasets | Yes | Datasets. HM3D [23] is a dataset... Gibson [24] was developed by Al-Halah et al. [43]. |
| Dataset Splits | Yes | We follow the validation settings from [3, 32] to evaluate our proposed method. ... We use 2000 episodes on the validation split of HM3D to report the results. Similarly, we follow this method [19] to produce the results on the Gibson dataset. |
| Hardware Specification | Yes | We use a Titan XP GPU for the experiment evaluation, and the entire evaluation process takes around 44 hours. |
| Software Dependencies | No | The paper mentions using CLIP and GPT-4V but does not provide specific version numbers for these or other software components. |
| Experiment Setup | Yes | In our experiment... We set Na to 1 and Ng to 3 for the experiments on the HM3D and Gibson datasets. For the partition process, we use three scaling levels in all our experiments: the first level is the original image, the second level has 4 equal-sized patches, and the third level has 16 equal-sized patches. We use CLIP as the pre-trained visual and text encoder. |
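The three-level partition described in the Experiment Setup row (full image, 4 patches, 16 patches) can be sketched as a simple image-pyramid split. This is a minimal illustration, not the authors' code: the helper name `multi_scale_patches` is hypothetical, and in the actual pipeline each patch would additionally be encoded with CLIP and scored against geometric-part and affordance text prompts.

```python
import numpy as np

def multi_scale_patches(image: np.ndarray, levels: int = 3) -> list:
    """Split an H x W x C image into a pyramid of equal-sized patches.

    Level 0 is the full image (1 patch), level 1 a 2x2 grid (4 patches),
    level 2 a 4x4 grid (16 patches) -- matching the paper's three scales.
    Hypothetical helper, illustrating the partition scheme only.
    """
    patches = []
    h, w = image.shape[:2]
    for level in range(levels):
        n = 2 ** level                  # grid side length: 1, 2, 4
        ph, pw = h // n, w // n         # patch height/width at this level
        for i in range(n):
            for j in range(n):
                patches.append(image[i * ph:(i + 1) * ph,
                                     j * pw:(j + 1) * pw])
    return patches

# 1 + 4 + 16 = 21 patches total; each would then go through the
# pre-trained CLIP image encoder to build the multi-scale GA scores.
patches = multi_scale_patches(np.zeros((64, 64, 3)))
```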