RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose detailed evaluation protocols for each domain in RL Unplugged and provide an extensive analysis of supervised learning and offline RL methods using these protocols. We will release data for all our tasks and open-source all algorithms presented in this paper.
Researcher Affiliation | Industry | D: DeepMind; G: Google Brain
Pseudocode | No | The main body of the paper does not contain pseudocode or algorithm blocks. It refers to the supplementary material for implementation details.
Open Source Code | Yes | We will release data for all our tasks and open-source all algorithms presented in this paper. [...] Our project page is available on github. [...] See our github project page for the details of our API (https://github.com/deepmind/deepmind-research/tree/master/rl_unplugged).
Open Datasets | Yes | We will release data for all our tasks and open-source all algorithms presented in this paper. [...] See our github project page for the details of our API (https://github.com/deepmind/deepmind-research/tree/master/rl_unplugged).
Dataset Splits | Yes | Figure 2 carries the labels "Policy Selection Training Validation Testing", and the paper states: "For each task, we clearly specify if it is intended for online vs offline policy selection."
Hardware Specification | No | The paper mentions 'limited computational budget' and 'large-scale distributed RL algorithms' but does not provide specific hardware details such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper mentions popular machine learning frameworks and specific algorithms such as DQN and D4PG, but does not provide version numbers for these software dependencies (e.g., 'PyTorch 1.9').
Experiment Setup | No | Detailed descriptions of the baselines and our implementations (including hyperparameters) are presented in Section A in the supplementary material. The details of the experimental protocol and the final hyperparameters are provided in the supplementary material. (The paper places these details in the supplementary material rather than in the main text.)