Implicit Quantile Networks for Distributional Reinforcement Learning
Authors: Will Dabney, Georg Ostrovski, David Silver, Rémi Munos
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm's implicitly defined distributions to study the effects of risk-sensitive policies in Atari games. |
| Researcher Affiliation | Industry | DeepMind, London, UK. Correspondence to: Will Dabney <wdabney@google.com>, Georg Ostrovski <ostrovski@google.com>. |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include a distinct block labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | C51 outperformed all previous improvements to DQN on a set of 57 Atari 2600 games in the Arcade Learning Environment (Bellemare et al., 2013), which we refer to as the Atari-57 benchmark. |
| Dataset Splits | No | The paper references the Atari-57 benchmark but does not specify any particular training, validation, or test splits (e.g., percentages, sample counts) that would be needed for reproduction beyond the inherent structure of the benchmark itself. While the benchmark implies a testing setup, explicit splits are not described. |
| Hardware Specification | No | The paper does not specify any particular GPU models, CPU types, or other hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | "for embedding dimension n = 64", "we varied N, N′ ∈ {1, 8, 32, 64}", and "fixed it at K = 32 for all experiments." (See the sketch after the table.) |
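
Since no official code is released, a minimal PyTorch sketch of the two components named in the Experiment Setup row may help a reproduction attempt: the cosine quantile embedding with dimension n = 64 and the quantile Huber loss over N sampled fractions for the online network and N′ for the target. The hyperparameters n, N, N′, K and the loss form follow the paper; the layer sizes, class names, and everything else here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of IQN's quantile embedding and loss (not the authors' code).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class IQNHead(nn.Module):
    """Maps a state embedding psi(x) and sampled quantile fractions tau to
    quantile values Z_tau(x, a), using the paper's cosine embedding
    phi_j(tau) = ReLU(sum_i cos(pi * i * tau) * w_ij + b_j), with n = 64."""

    def __init__(self, feature_dim, num_actions, n=64):
        super().__init__()
        self.n = n
        self.phi = nn.Linear(n, feature_dim)    # cosine basis -> feature space
        self.out = nn.Linear(feature_dim, num_actions)

    def forward(self, psi, tau):
        # psi: (batch, feature_dim); tau: (batch, N), fractions in (0, 1)
        i = torch.arange(self.n, device=tau.device).float()        # 0 .. n-1
        cos = torch.cos(math.pi * i * tau.unsqueeze(-1))           # (batch, N, n)
        phi = F.relu(self.phi(cos))                                # (batch, N, feature_dim)
        # Hadamard product of state and quantile embeddings, then linear output
        return self.out(psi.unsqueeze(1) * phi)                    # (batch, N, num_actions)

def quantile_huber_loss(pred, target, tau, kappa=1.0):
    """Quantile Huber loss: pred holds N online samples, target holds N'
    target samples; summed over N and averaged over N' as in the paper."""
    td = target.unsqueeze(1) - pred.unsqueeze(2)                   # (batch, N, N')
    huber = torch.where(td.abs() <= kappa,
                        0.5 * td.pow(2),
                        kappa * (td.abs() - 0.5 * kappa))
    # rho_tau(u) = |tau - 1{u < 0}| * L_kappa(u) / kappa
    rho = (tau.unsqueeze(2) - (td.detach() < 0).float()).abs() * huber / kappa
    return rho.sum(dim=1).mean()
```

At action-selection time the paper averages K = 32 independently sampled quantile values per action and acts greedily with respect to that mean, which is why K is fixed while N and N′ are ablated over {1, 8, 32, 64}.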