Learning to Navigate in Cities Without a Map

Authors: Piotr Mirowski, Matt Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Building upon recent research that applies deep reinforcement learning to maze navigation problems, we present an end-to-end deep reinforcement learning approach that can be applied on a city scale. Our baselines demonstrate that deep reinforcement learning agents can learn to navigate in multiple cities and to traverse to target destinations that may be kilometres away.
Researcher Affiliation | Industry | Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell; DeepMind, London, United Kingdom; {piotrmirowski, mkg, mateuszm, kmh, keithanderson}@google.com and {teplyashin, simonyan, korayk, zisserman, raia}@google.com
Pseudocode | No | The paper describes the algorithms and architectures in detail in the text and in diagrams (Figure 2), but it does not include formal pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | The project webpage http://streetlearn.cc contains a video summarizing our research and showing the trained agent in diverse city environments and on the transfer task, the form to request the StreetLearn dataset, and links to further resources. The StreetLearn environment code is available at https://github.com/deepmind/streetlearn.
Open Datasets | Yes | A multi-city version of the Street View-based RL environment, with carefully processed images provided by Google Street View (i.e., blurred faces and license plates, with a mechanism for enforcing image takedown requests), has been released for Manhattan and Pittsburgh and is accessible from http://streetlearn.cc and https://github.com/deepmind/streetlearn.
Dataset Splits | No | The paper mentions masking 25% of possible goals for generalization testing ('We mask 25% of the possible goals and train on the remaining ones'). However, it does not provide specific percentages or counts for distinct training, validation, and test splits, nor does it refer to standard predefined splits for reproducibility.
Hardware Specification | No | The paper mentions using IMPALA for training and specifies the number of actors and batch sizes, but it does not provide hardware details such as CPU/GPU models, memory, or cloud computing instance types used for the experiments.
Software Dependencies | No | The paper mentions using IMPALA, an actor-critic implementation, but does not provide version numbers for IMPALA or for any other software dependencies, libraries, or programming languages used in the experimental setup.
Experiment Setup | Yes | To train the agents, we use IMPALA [18], an actor-critic implementation that decouples acting and learning. We use 256 actors for CityNav and 512 actors for MultiCityNav, with batch sizes of 256 or 512 respectively, and sequences are unrolled to length 50. We start by sampling each new goal to be within 500m of the agent's position (phase 1). In phase 2, we progressively grow the maximum range of allowed destinations to cover the full graph (3.5km in the smaller New York areas, or 5km for central London or Paris).
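The goal-masking protocol quoted in the Dataset Splits row ("We mask 25% of the possible goals and train on the remaining ones") could be sketched as follows. This is a hypothetical reconstruction: the function name `split_goals`, the shuffling, and the fixed seed are assumptions, since the paper does not specify how goals were selected for masking.

```python
import random

def split_goals(goal_nodes, held_out_fraction=0.25, seed=0):
    """Split goal nodes into train and held-out (masked) sets.

    Hypothetical sketch of the 25% goal-masking protocol; the paper does
    not specify node identities, seeding, or the selection procedure.
    """
    rng = random.Random(seed)
    goals = list(goal_nodes)
    rng.shuffle(goals)
    n_held_out = int(len(goals) * held_out_fraction)
    # Train on the remaining goals; evaluate generalization on the masked 25%.
    return goals[n_held_out:], goals[:n_held_out]
```

With 100 candidate goal nodes, this yields 75 training goals and 25 masked goals reserved for generalization testing.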
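The two-phase curriculum quoted in the Experiment Setup row (goals within 500m in phase 1, growing to the full graph in phase 2) could be sketched as below. The linear growth schedule, the step counts, and the helper names `curriculum_max_range` and `sample_goal` are assumptions for illustration; the paper only states that the range grows progressively.

```python
import random

# From the quoted setup: goals start within 500 m (phase 1) and grow to
# cover the full graph (e.g. 3.5 km for the smaller New York areas).
PHASE1_RANGE_M = 500.0

def curriculum_max_range(step, phase1_steps, growth_steps, full_range_m):
    """Return the maximum allowed goal distance at a given training step.

    Assumed schedule: constant 500 m during phase 1, then linear growth
    over growth_steps up to full_range_m.
    """
    if step < phase1_steps:
        return PHASE1_RANGE_M
    frac = min(1.0, (step - phase1_steps) / growth_steps)
    return PHASE1_RANGE_M + frac * (full_range_m - PHASE1_RANGE_M)

def sample_goal(agent_node, nodes, max_range_m, distance_m):
    """Uniformly sample a goal node within max_range_m of the agent."""
    candidates = [n for n in nodes if distance_m(agent_node, n) <= max_range_m]
    return random.choice(candidates)
```

For example, with a 1,000-step phase 1 and 1,000 growth steps toward a 3.5 km graph, the allowed range is 500 m at step 0 and 3,500 m once the schedule saturates.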