FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation

Authors: Xinyu Sun, Peihao Chen, Jugang Fan, Jian Chen, Thomas Li, Mingkui Tan

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared with existing methods on the image-goal navigation benchmark, our method brings significant performance improvement on 3 benchmark datasets (i.e., Gibson, MP3D, and HM3D). Especially on Gibson, we surpass the state-of-the-art success rate by 8% with only 1/50 model size.
Researcher Affiliation | Academia | South China University of Technology; Pazhou Laboratory; Information Technology R&D Innovation Center of Peking University; Peking University Shenzhen Graduate School; Key Laboratory of Big Data and Intelligent Robot, Ministry of Education. Contact: csxinyusu@gmail.com, mingkuitan@scut.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | Project Page & Videos: https://xinyusun.github.io/fgprompt-pages
Open Datasets | Yes | As for image-goal navigation, we use the Habitat simulator [38, 42] and train our agent on the Gibson dataset with 72 training scenes and 14 testing scenes under the standard setting. We use the training episodes provided by [29] and train our agent for 500M steps.
Dataset Splits | Yes | For the training dataset, there are 72 scenes in total, each with 9k episodes, resulting in 648k episodes. The 9k episodes in each scene are evenly divided into three levels according to the distance from the start location to the goal location: easy (1.5–3m), medium (3–5m), and hard (5–10m). For evaluation on Gibson, we use two splits: split A [8] has 14 scenes with 1.4k episodes per level, and split B [3] has 14 scenes with 1k episodes per level. (A hedged difficulty-binning sketch follows the table.)
Hardware Specification | Yes | We train our model on the Gibson dataset using 20 environments running in parallel with 8 3090 GPUs.
Software Dependencies | No | We use the Habitat simulator [38, 42] and train our agent on the Gibson dataset... Following previous methods [52, 28], we adopt an actor-critic network to predict state value c_t and action a_t using s_t and train it end-to-end using PPO [39]. No specific version numbers for Habitat, PPO, or other software libraries are mentioned.
Experiment Setup | Yes | We train our agent for 500M steps, i.e., the total training time steps are set to 500M. For one episode, we set the maximum time steps to 500 when performing validation. We use the ZER reward [52] to encourage the agent to not only reach the goal position but also face the goal orientation. We set success distance d_s = 1m and α_s = 25. (A minimal actor-critic and success-check sketch follows the table.)
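
The difficulty levels quoted in the Dataset Splits row amount to a simple binning rule on the start-to-goal distance. The sketch below is a hypothetical illustration of that rule only; the function and field names (bin_episode, start_to_goal_distance) are assumptions and are not taken from the paper or from the Habitat episode format.

```python
# Hypothetical sketch of the difficulty binning described in the
# "Dataset Splits" row (easy 1.5-3m, medium 3-5m, hard 5-10m).
# Names and interval boundaries here are illustrative assumptions.
DIFFICULTY_BINS = {
    "easy": (1.5, 3.0),
    "medium": (3.0, 5.0),
    "hard": (5.0, 10.0),
}

def bin_episode(start_to_goal_distance):
    """Assign an episode to a difficulty level by start-to-goal distance (meters)."""
    for level, (lo, hi) in DIFFICULTY_BINS.items():
        if lo <= start_to_goal_distance < hi:
            return level
    return None  # outside the 1.5-10m range used for training episodes

# 72 scenes x 9k episodes per scene = 648k training episodes,
# with the 9k episodes in each scene split evenly (3k per level).
assert bin_episode(2.0) == "easy"
assert bin_episode(4.2) == "medium"
assert bin_episode(7.5) == "hard"
```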
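
For the training setup described in the Software Dependencies and Experiment Setup rows, a minimal sketch is given below, assuming a PyTorch implementation. It mirrors only the quoted structure (an actor-critic head predicting state value c_t and action a_t from a state embedding s_t, trained with PPO, plus a success check at d_s = 1m) and is not the authors' code; the ZER reward itself is defined in [52] and is not reproduced here.

```python
# Minimal actor-critic sketch, assuming PyTorch; not the authors' implementation.
import torch
import torch.nn as nn


class ActorCriticHead(nn.Module):
    def __init__(self, state_dim=512, num_actions=4):  # 4 discrete actions is an assumption
        super().__init__()
        self.actor = nn.Linear(state_dim, num_actions)  # logits over actions a_t
        self.critic = nn.Linear(state_dim, 1)           # scalar state value c_t

    def forward(self, s_t):
        dist = torch.distributions.Categorical(logits=self.actor(s_t))
        value = self.critic(s_t).squeeze(-1)
        return dist, value


def is_success(geodesic_dist_to_goal, d_s=1.0):
    # Success criterion quoted above: the agent ends up within d_s = 1m of the goal.
    # The quoted alpha_s = 25 would scale the success-related reward term in [52].
    return geodesic_dist_to_goal <= d_s


# Usage: sample an action a_t and estimate the value c_t for a batch of states s_t.
head = ActorCriticHead()
dist, value = head(torch.randn(2, 512))
action = dist.sample()
```

In a PPO training loop, dist.log_prob(action) and value would feed the clipped policy loss and the value loss, respectively.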