Learning To Explore Using Active Neural SLAM
Authors: Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments in visually and physically realistic simulated 3D environments demonstrate the effectiveness of our approach over past learning and geometry-based approaches. |
| Researcher Affiliation | Collaboration | 1Carnegie Mellon University, 2Facebook AI Research, 3UIUC |
| Pseudocode | No | The paper provides architectural descriptions and mathematical formulations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/devendrachaplot/Neural-SLAM ... We have released the noise models, along with their implementation in the Habitat simulator in the open-source code. |
| Open Datasets | Yes | We use the Habitat simulator (Savva et al., 2019) with the Gibson (Xia et al., 2018) and Matterport (MP3D) (Chang et al., 2017) datasets for our experiments. |
| Dataset Splits | Yes | We use train/val/test splits provided by Savva et al. (2019) for both the datasets. Note that the set of scenes used in each split is disjoint, which means the agent is tested on new scenes never seen during training. Gibson test set is not public but rather held out on an online evaluation server for the Pointgoal task. We use the validation as the test set for comparison and analysis for the Gibson domain. We do not use the validation set for hyper-parameter tuning. |
| Hardware Specification | No | The paper mentions using a 'LoCoBot' and 'Hokuyo UST-10LX Scanning Laser Rangefinder (LiDAR)' for collecting noise model data and real-world transfer. It also acknowledges 'NVIDIA's GPU support'. However, it does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running the primary experiments (training and evaluation in the Habitat simulator). |
| Software Dependencies | No | The paper mentions key software components like 'PyTorch', 'pyrobot API', 'ROS', and refers to a 'PPO implementation based on Kostrikov (2018)'. However, it does not provide specific version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | We train all the components with 72 parallel threads, with each thread using one of the 72 scenes in the Gibson training set. We maintain a FIFO memory of size 500000 for training the Neural SLAM module. After one step in all the environments (i.e. every 72 steps) we perform 10 updates to the Neural SLAM module with a batch size of 72. We use Adam optimizer with a learning rate of 0.0001. The obstacle map and explored area loss coefficients are 1 and the pose loss coefficient is 10000 (as MSE loss in meters and radians is much smaller). The Global policy samples a new goal every 25 timesteps. We use Proximal Policy Optimization (PPO) (Schulman et al., 2017) for training the Global policy. ... We use Adam optimizer with a learning rate of 0.000025, a discount factor of γ = 0.99, an entropy coefficient of 0.001, value loss coefficient of 0.5 for training the Global Policy. |
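The hyperparameters quoted in the Experiment Setup row can be collected into configuration dicts for a re-implementation. This is a hedged sketch only: the dict names (`neural_slam_cfg`, `global_policy_cfg`) and key names are illustrative and not taken from the released Neural-SLAM code; the numeric values are the ones reported in the paper.

```python
# Hypothetical config sketch assembling the training hyperparameters
# reported in the paper; names are illustrative, values are from the text.

neural_slam_cfg = {
    "num_threads": 72,               # one thread per Gibson training scene
    "replay_memory_size": 500_000,   # FIFO memory for Neural SLAM training
    "updates_per_env_step": 10,      # 10 updates after every 72 env steps
    "batch_size": 72,
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "map_loss_coef": 1.0,            # obstacle-map and explored-area losses
    "pose_loss_coef": 10_000.0,      # MSE in meters/radians is much smaller
}

global_policy_cfg = {
    "goal_sampling_interval": 25,    # timesteps between long-term goals
    "algorithm": "PPO",
    "optimizer": "Adam",
    "learning_rate": 2.5e-5,
    "gamma": 0.99,                   # discount factor
    "entropy_coef": 0.001,
    "value_loss_coef": 0.5,
}

# The pose loss is weighted 10,000x the map losses to compensate for scale:
pose_to_map_ratio = (neural_slam_cfg["pose_loss_coef"]
                     / neural_slam_cfg["map_loss_coef"])
print(pose_to_map_ratio)  # 10000.0
```

Keeping the values in one place like this also makes the cross-component asymmetry explicit, e.g. the Global Policy learning rate (2.5e-5) is 4x smaller than the Neural SLAM module's (1e-4).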