Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Authors: Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare our approach on six Atari games [Machado et al., 2018], following Stooke and Abbeel [2018], with vanilla A2C, A3C, and the off-policy IMPALA method [Dhariwal et al., 2017; Mnih et al., 2016; Espeholt et al., 2018]."
Researcher Affiliation | Collaboration | Mahmoud Assran, Facebook AI Research & Department of Electrical and Computer Engineering, McGill University
Pseudocode | Yes | Pseudocode is provided in Algorithm 1.
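The paper's actual procedure is defined by its Algorithm 1; purely as an illustration of the gossip-averaging idea the method builds on, here is a minimal sketch in which each learner periodically mixes its parameters with one peer's copy (the function name and the uniform mixing weight are assumptions for illustration, not the paper's exact update):

```python
def gossip_average(local_params, peer_params, mixing_weight=0.5):
    """Illustrative gossip step: convexly mix a learner's parameter
    vector with a peer's copy. Repeated pairwise mixing drives all
    learners toward the network-wide average without a central server."""
    return [mixing_weight * a + (1.0 - mixing_weight) * b
            for a, b in zip(local_params, peer_params)]

# Example: two learners holding [1.0, 2.0] and [3.0, 4.0]
# move toward their mutual average after one gossip exchange.
mixed = gossip_average([1.0, 2.0], [3.0, 4.0])
```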
Open Source Code | Yes | "Our implementation of GALA-A2C is publicly available at https://github.com/facebookresearch/gala."
Open Datasets | Yes | "We evaluate GALA for training Deep RL agents on Atari-2600 games [Machado et al., 2018]."
Dataset Splits | No | The paper reports training across "10 random seeds" and using "10 evaluation episodes" for final policy evaluation, but the main text does not describe validation splits or procedures for hyperparameter tuning or model selection.
Hardware Specification | Yes | "Figure 3: Comparing GALA-A2C hardware utilization to that of A2C when using one NVIDIA V100 GPU and 48 Intel CPUs."
Software Dependencies | No | The paper states that "All methods are implemented in PyTorch [Paszke et al., 2017]" and uses TorchBeast [Küttler et al., 2019] for a baseline, but does not provide version numbers for PyTorch or other key software libraries used in the implementation.
Experiment Setup | Yes | "We use the Adam optimizer [Kingma and Ba, 2014] with learning rate 7 × 10⁻⁴. The discount factor is set to 0.99 and the entropy regularization to 0.01."
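The reported hyperparameters (discount 0.99, entropy weight 0.01, Adam learning rate 7 × 10⁻⁴) can be sketched in plain Python; the helper names below are illustrative assumptions, and the authoritative setup is the linked GALA repository:

```python
import math

# Hyperparameters as reported in the paper's experiment setup.
LEARNING_RATE = 7e-4   # Adam learning rate
GAMMA = 0.99           # discount factor
ENTROPY_COEF = 0.01    # entropy regularization weight

def discounted_returns(rewards, gamma=GAMMA):
    """Backward recursion G_t = r_t + gamma * G_{t+1}, as used to
    compute actor-critic targets."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]

def policy_entropy(probs):
    """Shannon entropy of an action distribution; scaled by
    ENTROPY_COEF it forms the entropy regularization bonus."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Example: a reward of 1 arriving two steps in the future is
# discounted by GAMMA squared at the first timestep.
returns = discounted_returns([0.0, 0.0, 1.0])
bonus = ENTROPY_COEF * policy_entropy([0.5, 0.5])
```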