RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning
Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose detailed evaluation protocols for each domain in RL Unplugged and provide an extensive analysis of supervised learning and offline RL methods using these protocols. We will release data for all our tasks and open-source all algorithms presented in this paper. |
| Researcher Affiliation | Industry | DeepMind (D); Google Brain (G) |
| Pseudocode | No | The main body of the paper contains no pseudocode or algorithm blocks. It refers to the supplementary material for implementation details. |
| Open Source Code | Yes | We will release data for all our tasks and open-source all algorithms presented in this paper. [...] Our project page is available on github. [...] See our github project page for the details of our API (https://github.com/deepmind/deepmind-research/tree/master/rl_unplugged). |
| Open Datasets | Yes | We will release data for all our tasks and open-source all algorithms presented in this paper. [...] See our github project page for the details of our API (https://github.com/deepmind/deepmind-research/tree/master/rl_unplugged). |
| Dataset Splits | Yes | Figure 2 carries the labels "Policy Selection", "Training", "Validation", and "Testing". [...] For each task, we clearly specify if it is intended for online vs offline policy selection. |
| Hardware Specification | No | The paper mentions 'limited computational budget' and 'large-scale distributed RL algorithms' but does not provide specific hardware details like GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions popular machine learning frameworks and specific algorithms like DQN and D4PG, but does not provide specific version numbers for these software dependencies (e.g., 'PyTorch 1.9'). |
| Experiment Setup | No | Detailed descriptions of the baselines and our implementations (including hyperparameters) are presented in Section A in the supplementary material. [...] The details of the experimental protocol and the final hyperparameters are provided in the supplementary material. (These details are deferred to the supplementary material rather than given in the main text.) |