Object Goal Navigation using Goal-Oriented Semantic Exploration

Authors: Devendra Singh Chaplot, Dhiraj Prakashchand Gandhi, Abhinav Gupta, Russ R. Salakhutdinov

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines, including end-to-end learning-based methods as well as modular map-based methods, and led to the winning entry of the CVPR 2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene and uses them to explore efficiently.
Researcher Affiliation | Collaboration | Carnegie Mellon University, Facebook AI Research
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/devendrachaplot/Object-Goal-Navigation
Open Datasets | Yes | We use the Gibson [46] and Matterport3D (MP3D) [6] datasets in the Habitat simulator [39] for our experiments.
Dataset Splits | Yes | For the Gibson dataset, we use the train and val splits of the Gibson tiny set for training and testing, respectively, as the test set is held out for the online evaluation server. We do not use the validation set for hyper-parameter tuning. For the MP3D dataset, we use the standard train and test splits. Our training and test sets consist of 86 scenes (25 Gibson tiny and 61 MP3D) and 16 scenes (5 Gibson tiny and 11 MP3D), respectively.
Hardware Specification | No | The paper mentions "86 parallel threads" and acknowledges "NVIDIA’s GPU support" but does not specify exact GPU/CPU models, memory sizes, or other hardware details used for the experiments.
Software Dependencies | No | The paper mentions using "PyTorch [35]" and "Mask-RCNN [18]" but does not provide version numbers for these software dependencies, which are needed for exact reproduction.
Experiment Setup | Yes | After one step in each thread, we perform 10 updates to the Semantic Mapping module with a batch size of 64. We use the Adam optimizer with a learning rate of 0.0001. We use binary cross-entropy loss for semantic map prediction. The Goal-driven Policy samples a new goal every u = 25 timesteps. For training this policy, we use Proximal Policy Optimization (PPO) [40] with a time horizon of 20 steps, 36 mini-batches, and 4 epochs in each PPO update. Our PPO implementation is based on [25]. The reward for the policy is the decrease in distance to the nearest goal object. We use the Adam optimizer with a learning rate of 0.000025, a discount factor of γ = 0.99, an entropy coefficient of 0.001, and a value loss coefficient of 0.5 for training the Goal-driven Policy.
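
To make the Experiment Setup quoted above easier to scan, the following is a minimal PyTorch sketch that collects those hyperparameters in one place. It is a reading aid, not the authors' implementation: `mapper` and `policy` are throwaway placeholder networks (the paper's actual architectures and PPO update loop are not reproduced), and the variable names such as `ppo_hparams` are hypothetical.

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder networks standing in for the Semantic Mapping module and the
# Goal-driven Policy; the paper's actual architectures are not reproduced here.
mapper = nn.Conv2d(in_channels=4, out_channels=16, kernel_size=3, padding=1)
policy = nn.Linear(in_features=256, out_features=2)

# Semantic Mapping module: binary cross-entropy loss, Adam with lr = 0.0001,
# 10 updates of batch size 64 after one step in each parallel thread.
map_optimizer = optim.Adam(mapper.parameters(), lr=1e-4)
map_loss_fn = nn.BCELoss()          # expects per-cell probabilities in [0, 1]
MAP_UPDATES_PER_ENV_STEP = 10
MAP_BATCH_SIZE = 64

# Goal-driven Policy: PPO with Adam, lr = 0.000025; a new long-term goal is
# sampled every u = 25 timesteps.
policy_optimizer = optim.Adam(policy.parameters(), lr=2.5e-5)
ppo_hparams = dict(
    goal_sampling_interval=25,      # u = 25 timesteps between long-term goals
    time_horizon=20,                # PPO rollout length
    num_mini_batches=36,
    ppo_epochs=4,
    gamma=0.99,                     # discount factor
    entropy_coef=0.001,
    value_loss_coef=0.5,
)
# Reward signal: decrease in distance to the nearest instance of the goal object.
```

Note that, per the quoted setup, the mapping module is trained with direct supervision (binary cross-entropy) while the Goal-driven Policy is trained with the RL reward, which is why the two modules have separate optimizers and learning rates.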