Neural Map: Structured Memory for Deep Reinforcement Learning
Authors: Emilio Parisotto, Ruslan Salakhutdinov
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically that the Neural Map surpasses previous DRL memories on a set of challenging 2D and 3D maze environments and show that it is capable of generalizing to environments that were not seen during training. |
| Researcher Affiliation | Academia | Emilio Parisotto & Ruslan Salakhutdinov Department of Machine Learning Carnegie Mellon University Pittsburgh, PA 15213, USA {eparisot,rsalakhu}@cs.cmu.edu |
| Pseudocode | No | The paper specifies its operations through mathematical equations but does not include any pseudocode or algorithm blocks. The first sketch below the table renders those operations in code. |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available or provide any links to a code repository. |
| Open Datasets | No | The mazes during training are generated using a random generator. A held-out set of 1000 random mazes is kept for testing. The paper describes generating its own maze environments, and does not provide access information (link, citation, repository) to a publicly available or open dataset. |
| Dataset Splits | No | The mazes during training are generated using a random generator. A held-out set of 1000 random mazes is kept for testing. The paper mentions training and test sets but does not specify a distinct validation split for hyperparameter tuning, although it does refer to a 'limited hyperparameter sweep'. The second sketch below the table shows how such a seed-based split can be reproduced. |
| Hardware Specification | Yes | The authors would also like to thank NVidia NVAIL award for donating DGX-1 deep learning machine. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC). |
| Software Dependencies | No | For optimization, all architectures used the RMSprop optimization algorithm... For optimization, all architectures used the Adam optimization algorithm... The paper mentions optimization algorithms and the ViZDoom API, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For optimization, all architectures used the RMSprop optimization algorithm with gradients thresholded to norm 20 for LSTM, 100 for Neural Map variants, and no thresholding for memory networks. An auxiliary weighted entropy loss with weight 0.01 was used with the synchronous advantage actor-critic (A2C). The learning rate was 0.0025 for LSTM models, 0.005 for Neural Map variants, and 0.001 for memory networks. A2C was run with the number of time steps equal to 5, and training lasted 10 million updates. These settings are sketched in code in the third block below the table. |
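Since the paper defines the Neural Map only through equations, the following is a minimal PyTorch sketch of its four operations: a global read summarizing the map into a vector r_t, a context read that attends over map positions with a query built from the state and r_t, a write producing a new feature at the agent's current (x, y) position, and an update that replaces the map entry at that position. The map size, feature dimension, and the specific networks used for the read and write steps are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of the Neural Map's read/context/write/update operations,
# as defined by the paper's equations. Layer sizes and the CNN used for the
# global read are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMap(nn.Module):
    def __init__(self, c=32, h=15, w=15, s_dim=256):
        super().__init__()
        self.c, self.h, self.w = c, h, w
        # Global read: conv + FC summarizing the whole map into r_t.
        self.read_conv = nn.Conv2d(c, c, kernel_size=3, padding=1)
        self.read_fc = nn.Linear(c * h * w, c)
        # Query projection for the context read: q_t = W [s_t, r_t].
        self.query = nn.Linear(s_dim + c, c)
        # Write network: w_{t+1} = f([s_t, r_t, c_t, M_t^{(x,y)}]).
        self.write = nn.Sequential(nn.Linear(s_dim + 3 * c, c), nn.ReLU(),
                                   nn.Linear(c, c))

    def forward(self, M, s, pos):
        # M: (B, C, H, W) map memory; s: (B, s_dim) state feature;
        # pos: (B, 2) agent (x, y) coordinates as a LongTensor.
        B = M.size(0)
        r = self.read_fc(F.relu(self.read_conv(M)).flatten(1))       # r_t
        q = self.query(torch.cat([s, r], dim=1))                      # q_t
        flat = M.flatten(2)                                           # (B, C, H*W)
        att = F.softmax(torch.einsum('bc,bcn->bn', q, flat), dim=1)   # α over positions
        ctx = torch.einsum('bn,bcn->bc', att, flat)                   # c_t
        m_xy = M[torch.arange(B), :, pos[:, 1], pos[:, 0]]            # M_t^{(x,y)}
        w_new = self.write(torch.cat([s, r, ctx, m_xy], dim=1))       # w_{t+1}
        M_next = M.clone()
        M_next[torch.arange(B), :, pos[:, 1], pos[:, 0]] = w_new      # local update
        o = torch.cat([r, ctx, w_new], dim=1)                         # output o_t
        return o, M_next
```

The design point the sketch preserves is that writes are localized to the agent's current position, which is what gives the memory its spatial structure.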
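Because both the training and test mazes come from a procedural generator rather than a released dataset, the split is naturally expressed in terms of generator seeds. A minimal sketch of holding out 1000 mazes this way; `generate_maze` is a hypothetical stand-in for the paper's actual maze generator:

```python
# Seed-based train/test split for procedurally generated mazes. The
# generator below is a hypothetical stub, not the paper's generator.
import random

def generate_maze(seed, size=9):
    # Stand-in: a seeded random wall grid, just to make the split concrete.
    rng = random.Random(seed)
    return [[rng.random() < 0.3 for _ in range(size)] for _ in range(size)]

split_rng = random.Random(0)
test_seeds = set(split_rng.sample(range(10**9), 1000))  # held-out 1000 mazes

def sample_training_maze():
    # Training mazes are drawn on the fly, skipping held-out seeds.
    while True:
        seed = split_rng.randrange(10**9)
        if seed not in test_seeds:
            return generate_maze(seed)
```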
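The quoted training settings translate directly into an optimizer configuration. A sketch, where the model placeholder and the value-loss weight are assumptions not taken from the paper:

```python
# Sketch of the quoted optimization settings. The values below are the
# Neural Map ones; swap in lr=0.0025 / max_norm=20 for LSTM, and
# lr=0.001 with no clipping for memory networks.
import torch

model = torch.nn.Linear(10, 10)  # placeholder for the actual agent network
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.005)

def a2c_step(policy_loss, value_loss, entropy):
    # Entropy bonus weight 0.01 as quoted; the 0.5 value-loss weight is a
    # conventional A2C choice, not stated in the paper.
    loss = policy_loss + 0.5 * value_loss - 0.01 * entropy
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=100)
    optimizer.step()
```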