Statistics and Samples in Distributional Reinforcement Learning

Authors: Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare EDRL with existing methods on a variety of MDPs to illustrate concrete aspects of our analysis, and develop a deep RL variant of the algorithm, ER-DQN, which we evaluate on the Atari-57 suite of games.
Researcher Affiliation | Industry | DeepMind; Google Brain. Correspondence to: Mark Rowland <markrowland@google.com>.
Pseudocode | Yes | Algorithm 1: Generic DRL update algorithm. Algorithm 2: Stochastic EDRL update algorithm.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate ER-DQN on the Arcade Learning Environment (Bellemare et al., 2013).
Dataset Splits | No | The paper mentions evaluating on the Atari-57 suite but does not specify concrete train/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | The paper does not specify any particular hardware components such as GPU/CPU models, memory, or cloud computing instance types used for running experiments.
Software Dependencies | No | The paper mentions using a SciPy optimisation routine, but does not provide a specific version number for SciPy or any other software dependencies.
Experiment Setup | Yes | Precise experimental details and results are given in Appendix Section D.
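
The stochastic EDRL update named in the Pseudocode row is built around expectile regression. The sketch below is a minimal, illustrative NumPy example of that statistic only: fitting a single tau-expectile of sampled returns by gradient descent on the asymmetric squared loss. It is not the authors' Algorithm 2 or ER-DQN; the function names, learning rate, step count, and synthetic return distribution are assumptions made for illustration.

```python
import numpy as np

def expectile_loss(q, samples, tau):
    """Asymmetric squared loss; its minimiser over q is the tau-expectile of `samples`."""
    diff = samples - q
    weight = np.where(diff < 0.0, 1.0 - tau, tau)  # samples above q weighted by tau, below by 1 - tau
    return np.mean(weight * diff ** 2)

def fit_expectile(samples, tau, lr=0.1, steps=500):
    """Fit a single tau-expectile by gradient descent on the loss above."""
    q = float(np.mean(samples))  # the mean is the 0.5-expectile, a sensible starting point
    for _ in range(steps):
        diff = samples - q
        weight = np.where(diff < 0.0, 1.0 - tau, tau)
        q -= lr * (-2.0 * np.mean(weight * diff))  # gradient of expectile_loss with respect to q
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    returns = rng.normal(loc=1.0, scale=2.0, size=10_000)  # synthetic stand-in for sampled returns
    for tau in (0.1, 0.5, 0.9):
        e = fit_expectile(returns, tau)
        print(f"tau={tau:.1f}  expectile={e:+.3f}  loss={expectile_loss(e, returns, tau):.3f}")
```

As a sanity check, the tau = 0.5 case should come out very close to the sample mean, since the mean is the 0.5-expectile.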