reproducibilityindex.ai

High-Throughput Synchronous Deep RL

Authors: Iou-Jen Liu, Raymond Yeh, Alexander Schwing

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our approach on Atari games and the Google Research Football environment. Compared to synchronous baselines, HTS-RL is 2-6× faster. Compared to state-of-the-art asynchronous methods, HTS-RL has competitive throughput and consistently achieves higher average episode rewards.
Researcher Affiliation	Academia	Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing, University of Illinois at Urbana-Champaign, {iliu3, yeh17, aschwing}@illinois.edu
Pseudocode	No	The paper includes diagrams of system structures (Figure 1) and timelines (Figure 2) but no structured pseudocode or algorithm blocks.
Open Source Code	Yes	Code is available at https://github.com/IouJenLiu/HTS-RL.
Open Datasets	Yes	We evaluate the proposed approach on a subset of the Atari games [2, 3] and all 11 academy scenarios of the recently introduced Google Research Football (GFootball) [15] environment.
Dataset Splits	No	The paper mentions evaluating using 'average over the last 100 evaluation episodes' but does not specify training, validation, or test dataset splits with percentages or counts in the main text.
Hardware Specification	No	The paper mentions using 'a single machine with 4 GPUs' but does not specify the exact GPU models, CPU models, or other detailed hardware specifications used for the experiments.
Software Dependencies	No	The paper mentions using 'TorchBeast implementation' for IMPALA and 'Pytorch implementation of Kostrikov' for A2C and PPO, but it does not provide specific version numbers for these software components or any other libraries.
Experiment Setup	Yes	Following Kostrikov [14], all methods are trained for 20M environment steps in the Atari environment. For GFootball, following Kurach et al. [15], we use 5M steps. ...Please see the supplementary material for details on hyperparameter settings and model architectures.
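
The evaluation metric quoted in the Dataset Splits row, the "average over the last 100 evaluation episodes", can be illustrated with a minimal sketch. This is not taken from the HTS-RL code: the function name `last_n_average` and the reward values are hypothetical, and the sketch assumes rewards are logged as one number per evaluation episode.

```python
from collections import deque


def last_n_average(episode_rewards, n=100):
    """Mean of the final n episode rewards (all of them if fewer than n exist)."""
    # A bounded deque keeps only the most recent n values.
    window = deque(episode_rewards, maxlen=n)
    if not window:
        raise ValueError("no evaluation episodes recorded")
    return sum(window) / len(window)


# Hypothetical rewards for 150 evaluation episodes: 0.0, 1.0, ..., 149.0.
rewards = [float(i) for i in range(150)]
print(last_n_average(rewards))  # mean of 50.0 .. 149.0, i.e. 99.5
```

Only the most recent window of episodes contributes to the score, so early training performance does not affect it.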