Learning to Navigate in Complex Environments

Authors: Piotr Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andrew J. Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, Raia Hadsell

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach using five 3D maze environments and demonstrate the accelerated learning and increased performance of the proposed agent architecture. These environments feature complex geometry, random start position and orientation, dynamic goal locations, and long episodes that require thousands of agent steps (see Figure 1).
Researcher Affiliation | Industry | Piotr Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andrew J. Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, Raia Hadsell. DeepMind, London, UK. {piotrmirowski, razp, fviola, soyer, aybd, abanino, mdenil, goroshin, sifre, korayk, dkumaran, raia}@google.com
Pseudocode | No | The paper describes network architectures and training details but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper states: 'The environments used in this paper are publicly available at https://github.com/deepmind/lab.' This refers to the environments, not the source code for the agent and training methodology described in the paper.
Open Datasets | Yes | We consider a set of first-person 3D mazes from the DeepMind Lab environment (Beattie et al., 2016) (see Fig. 1)... The environments used in this paper are publicly available at https://github.com/deepmind/lab. (A minimal environment-loading sketch follows the table.)
Dataset Splits | No | The paper describes environment dimensions and episode lengths but does not explicitly provide training/validation/test dataset splits. It mentions '100 test episodes' but no formal split percentages or sample counts for training and validation.
Hardware Specification | No | The paper mentions 'We use 16 workers' but does not specify any particular hardware (GPU, CPU, etc.) used for running the experiments.
Software Dependencies | No | The paper mentions algorithms like A3C and RMSProp but does not provide specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, CUDA 11.x). (An RMSProp sketch follows the table.)
Experiment Setup | Yes | Learning rate was sampled from [10^-4, 5×10^-4]. Strength of the entropy regularization from [10^-4, 10^-3]. ... Gradients are computed over non-overlapping chunks of 50 or 75 steps of the episode. The auxiliary tasks, when used, have hyperparameters sampled from: coefficient β_d of the depth prediction loss from convnet features L_d sampled from {3.33, 10, 33}; coefficient β_d' of the depth prediction loss from LSTM hiddens L_d' sampled from {1, 3.33, 10}; coefficient β_l of the loop closure prediction loss L_l sampled from {1, 3.33, 10}. (Sampling and loss-combination sketches follow the table.)
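
The Open Datasets row above points at the DeepMind Lab repository. As a minimal sketch of what using those mazes looks like, the snippet below loads a level and steps it once through the standard deepmind_lab Python API; the level name 'nav_maze_static_01', the 84x84 frame size, and the action repeat of 4 are assumptions rather than details quoted from the paper.

```python
# Minimal sketch: load a DeepMind Lab maze and take one step.
# Level name, frame size, and action repeat are assumptions.
import numpy as np
import deepmind_lab

env = deepmind_lab.Lab(
    'nav_maze_static_01',                    # assumed level from github.com/deepmind/lab
    ['RGB_INTERLEAVED'],                     # first-person RGB observation
    config={'width': '84', 'height': '84'})

env.reset()
noop = np.zeros(len(env.action_spec()), dtype=np.intc)  # all-zero action vector
reward = env.step(noop, num_steps=4)                    # repeat the action for 4 frames
frame = env.observations()['RGB_INTERLEAVED']           # (84, 84, 3) uint8 frame
```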
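
The Software Dependencies row names A3C and RMSProp without further detail. For reference, a plain-Python sketch of the standard RMSProp update used with A3C is below; the decay and epsilon values are common A3C defaults, not values taken from this paper.

```python
def rmsprop_update(param, grad, mean_square, lr, decay=0.99, eps=0.1):
    """One RMSProp step; decay/eps are typical A3C defaults, not from the paper."""
    mean_square = decay * mean_square + (1.0 - decay) * grad * grad
    param = param - lr * grad / (mean_square + eps) ** 0.5
    return param, mean_square
```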
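
The Experiment Setup row gives sampling intervals and candidate sets for the hyperparameter search. A small sketch of reproducing that sampling follows; log-uniform draws over the continuous intervals are an assumption (the row states only the intervals), and all dictionary keys are illustrative names.

```python
import math
import random

def sample_hyperparameters():
    """Draw one configuration from the ranges in the Experiment Setup row.

    Log-uniform sampling over the continuous intervals is an assumption;
    the row states only the intervals and the discrete candidate sets.
    """
    def log_uniform(lo, hi):
        return 10.0 ** random.uniform(math.log10(lo), math.log10(hi))

    return {
        'learning_rate': log_uniform(1e-4, 5e-4),
        'entropy_cost':  log_uniform(1e-4, 1e-3),
        'unroll_length': random.choice([50, 75]),        # non-overlapping gradient chunks
        'beta_d_conv':   random.choice([3.33, 10, 33]),  # depth loss from convnet features
        'beta_d_lstm':   random.choice([1, 3.33, 10]),   # depth loss from LSTM hiddens
        'beta_loop':     random.choice([1, 3.33, 10]),   # loop-closure loss
    }
```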
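
Those beta coefficients weight auxiliary losses added to the main A3C objective. A hedged sketch of the combination is below; the argument names are invented for illustration, and the exact form of each loss term is defined in the paper, not here.

```python
def combined_loss(a3c_loss, depth_conv_loss, depth_lstm_loss, loop_loss, hp):
    # Weighted sum of the A3C objective and the three auxiliary losses,
    # using the beta coefficients sampled above. All names are illustrative.
    return (a3c_loss
            + hp['beta_d_conv'] * depth_conv_loss
            + hp['beta_d_lstm'] * depth_lstm_loss
            + hp['beta_loop'] * loop_loss)
```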