Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

Authors: Luisa M Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson

ICML 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present four experiments that illustrate how and why HyperX helps agents meta-learn good online adaptation strategies (Sec. 5.1-5.3), and results on sparse MuJoCo AntGoal to demonstrate that HyperX scales well (Sec. 5.4). |
| Researcher Affiliation | Collaboration | (1) University of Oxford, UK; (2) Mila, Université de Montréal, Canada; (3) Microsoft Research, Cambridge, UK. |
| Pseudocode | Yes | Algorithm 1: HyperX Pseudo-Code (a hedged sketch of the algorithm's core exploration bonus follows this table). |
| Open Source Code | Yes | Our source code can be found at https://github.com/lmzintgraf/hyperx. |
| Open Datasets | No | The paper describes the use of standard RL environments and custom-designed tasks (e.g., Treasure Mountain, Multi-Stage Gridworld, sparse HalfCheetahDir, sparse MuJoCo AntGoal) but does not provide specific links, DOIs, or citations with author/year for public access to the *datasets* or *sampled task distributions* used in its experiments. It describes how these environments are configured or modified rather than providing access to data. |
| Dataset Splits | No | The paper operates in a reinforcement learning setting, describing meta-training on a task distribution and evaluation on new tasks from that distribution. It does not provide explicit train/validation/test dataset splits (e.g., percentages or sample counts) as commonly seen in supervised learning contexts. |
| Hardware Specification | No | The acknowledgements mention 'computing resources provided by Compute Canada' and 'a generous equipment grant from NVIDIA', but no specific hardware details such as exact GPU/CPU models, processor types, or memory amounts are provided. |
| Software Dependencies | No | The paper provides a link to its source code but does not explicitly list software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') in the main text. |
| Experiment Setup | No | The paper states 'Implementation details are given in Appendix C.' and refers to a table of hyperparameters in the appendix. However, these specific experimental setup details (e.g., hyperparameter values, training configurations) are not provided in the main body of the paper. |
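
The Pseudocode row refers to Algorithm 1 in the paper. For orientation, below is a minimal PyTorch sketch of the core idea that algorithm is built around: HyperX rewards novelty in approximate hyper-state space, i.e., the environment state concatenated with the agent's task belief, here approximated with random network distillation (RND). All class, function, and variable names are illustrative assumptions, not the repo's actual API; the paper additionally uses a VAE reconstruction-error bonus, which is omitted here.

```python
# Hedged sketch of a hyper-state novelty bonus via random network
# distillation (RND). Shapes, names, and the bonus weight are assumptions.
import torch
import torch.nn as nn


def mlp(in_dim: int, out_dim: int, hidden: int = 128) -> nn.Sequential:
    """Small two-layer MLP used for both RND networks."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


class HyperStateNoveltyBonus(nn.Module):
    """RND over hyper-states: a frozen random 'target' network and a trained
    'predictor'. The predictor's error is large on rarely visited hyper-states
    and shrinks with visitation, giving a decaying exploration bonus."""

    def __init__(self, state_dim: int, belief_dim: int, embed_dim: int = 64):
        super().__init__()
        self.target = mlp(state_dim + belief_dim, embed_dim)
        self.predictor = mlp(state_dim + belief_dim, embed_dim)
        for p in self.target.parameters():  # target stays fixed
            p.requires_grad_(False)

    def forward(self, state: torch.Tensor, belief: torch.Tensor) -> torch.Tensor:
        hyper_state = torch.cat([state, belief], dim=-1)
        # Per-sample squared prediction error serves as the novelty bonus.
        return ((self.predictor(hyper_state)
                 - self.target(hyper_state)) ** 2).mean(dim=-1)


# Usage sketch: add the weighted bonus to the environment reward during
# meta-training, and train the predictor on the same hyper-states.
state_dim, belief_dim = 10, 5
bonus_fn = HyperStateNoveltyBonus(state_dim, belief_dim)
opt = torch.optim.Adam(bonus_fn.predictor.parameters(), lr=1e-4)

states = torch.randn(32, state_dim)    # batch of environment states
beliefs = torch.randn(32, belief_dim)  # e.g., VAE posterior means (assumed)
bonus = bonus_fn(states, beliefs)      # shape: (32,)
shaped_reward_term = 0.1 * bonus.detach()  # weight is a hyperparameter

opt.zero_grad()
bonus.mean().backward()  # training the predictor makes the bonus decay
opt.step()
```

The key design point this sketch illustrates is that the bonus is computed over hyper-states rather than raw states, so the agent is driven to visit states under *beliefs* it has not yet experienced, which is what makes the exploration useful for meta-learning online adaptation.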