Object Goal Navigation using Goal-Oriented Semantic Exploration
Authors: Devendra Singh Chaplot, Dhiraj Prakashchand Gandhi, Abhinav Gupta, Russ R. Salakhutdinov
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines including end-to-end learning-based methods as well as modular map-based methods and led to the winning entry of the CVPR-2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene, and uses them to explore efficiently. |
| Researcher Affiliation | Collaboration | 1Carnegie Mellon University, 2Facebook AI Research |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/devendrachaplot/Object-Goal-Navigation |
| Open Datasets | Yes | We use the Gibson [46] and Matterport3D (MP3D) [6] datasets in the Habitat simulator [39] for our experiments. |
| Dataset Splits | Yes | For the Gibson dataset, we use the train and val splits of the Gibson tiny set for training and testing respectively, as the test set is held out for the online evaluation server. We do not use the validation set for hyper-parameter tuning. For the MP3D dataset, we use the standard train and test splits. Our training and test sets consist of a total of 86 scenes (25 Gibson tiny and 61 MP3D) and 16 scenes (5 Gibson tiny and 11 MP3D), respectively. |
| Hardware Specification | No | The paper mentions "86 parallel threads" and acknowledges "NVIDIA’s GPU support" but does not specify exact GPU/CPU models, memory amounts, or other detailed computer specifications used for experiments. |
| Software Dependencies | No | The paper mentions using "PyTorch [35]" and "Mask R-CNN [18]" but does not provide specific version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | After one step in each thread, we perform 10 updates to the Semantic Mapping module with a batch size of 64. We use Adam optimizer with a learning rate of 0.0001. We use binary cross-entropy loss for semantic map prediction. The Goal-driven Policy samples a new goal every u = 25 timesteps. For training this policy, we use Proximal Policy Optimization (PPO) [40] with a time horizon of 20 steps, 36 mini-batches, and 4 epochs in each PPO update. Our PPO implementation is based on [25]. The reward for the policy is the decrease in distance to the nearest goal object. We use Adam optimizer with a learning rate of 0.000025, a discount factor of γ = 0.99, an entropy coefficient of 0.001, value loss coefficient of 0.5 for training the Goal-driven Policy. |
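
The Experiment Setup row above lists the optimization hyper-parameters for the two trainable components. The sketch below is a rough, hypothetical illustration of those settings in PyTorch; the network definitions (`semantic_mapper`, `goal_policy`) are placeholders and not the authors' architectures, and only the hyper-parameter values come from the paper.

```python
# Minimal sketch of the reported optimization settings, assuming placeholder
# networks; only the hyper-parameter values are taken from the paper.
import torch
import torch.nn as nn

# Stand-in modules, purely illustrative (not the paper's architectures).
semantic_mapper = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.Conv2d(16, 16, 1))
goal_policy = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))

# Semantic Mapping module: Adam with lr 1e-4 and binary cross-entropy loss;
# the paper reports 10 updates with batch size 64 after one step in each
# of the parallel threads.
map_optimizer = torch.optim.Adam(semantic_mapper.parameters(), lr=1e-4)
map_loss_fn = nn.BCEWithLogitsLoss()

# Goal-driven Policy: trained with PPO using the quoted hyper-parameters.
policy_optimizer = torch.optim.Adam(goal_policy.parameters(), lr=2.5e-5)
ppo_config = dict(
    time_horizon=20,            # PPO rollout length
    num_mini_batches=36,
    ppo_epochs=4,               # epochs per PPO update
    gamma=0.99,                 # discount factor
    entropy_coef=0.001,
    value_loss_coef=0.5,
    goal_resample_interval=25,  # new long-term goal every u = 25 timesteps
)
```

The reward used to train the policy (the decrease in distance to the nearest goal object) and the PPO update loop itself are not shown; the paper states its PPO implementation is based on [25].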