Relative Variational Intrinsic Control
Authors: Kate Baumli, David Warde-Farley, Steven Hansen, Volodymyr Mnih (pp. 6732-6740)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the skills learned by RVIC both qualitatively and quantitatively (via hierarchical reinforcement learning) on the DeepMind Control Suite (Tassa et al. 2018) and Atari 2600 games from The Arcade Learning Environment (ALE) (Bellemare et al. 2013). |
| Researcher Affiliation | Industry | Kate Baumli, David Warde-Farley, Steven Hansen, Volodymyr Mnih; DeepMind; {baumli, dwf, stevenhansen, vmnih}@google.com |
| Pseudocode | Yes | See Figure 1 and Algorithm 1 for further summary of the Relative Variational Intrinsic Control method. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | In this section, we evaluate the skills learned by RVIC both qualitatively and quantitatively (via hierarchical reinforcement learning) on the DeepMind Control Suite (Tassa et al. 2018) and Atari 2600 games from The Arcade Learning Environment (ALE) (Bellemare et al. 2013). |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly specify dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | The paper mentions training models but does not provide any specific hardware details such as GPU models, CPU models, or cloud computing instance types used for the experiments. |
| Software Dependencies | No | The paper mentions using 'R2D2' for training but does not provide specific version numbers for any software dependencies like Python, PyTorch, TensorFlow, or other libraries. |
| Experiment Setup | Yes | "All final values used for hyperparameters can be found in Table 1 in the Appendix." "Table 1: A table of hyperparameters for skill learning experiments." "Network architecture and hyperparameters for HRL experiments are identical to those in R2D2..." "Table 2: A table of the best hyperparameters for each level of Atari in the HRL experiments." |