Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning

Authors: Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experiments demonstrate online adaptation for continuous control tasks on both simulated and real-world agents. |
| Researcher Affiliation | Academia | Anusha Nagabandi*, Ignasi Clavera*, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, & Chelsea Finn University of California, Berkeley {nagaban2,iclavera,simin.liu}@berkeley.edu {ronf,pabbeel,svlevine,cbfinn}@berkeley.edu |
| Pseudocode | Yes | Algorithm 1 Model-Based Meta-Reinforcement Learning (train time) and Algorithm 2 Online Model Adaptation (test time) are provided in Section 4. |
| Open Source Code | No | The paper mentions videos are available online at a project website, but does not state that source code for their method is released or provide a link to a code repository. |
| Open Datasets | No | We meta-train a dynamics model for this robot using the meta-objective described in Equation 3, and we train it to adapt on entirely real-world data from three different training terrains: carpet, styrofoam, and turf. We collect approximately 30 minutes of data from each of the three training terrains. The paper describes a custom dataset but does not provide concrete access information (link, DOI, citation) for public availability. |
| Dataset Splits | Yes | In these experiments, note that all agents were meta-trained on a distribution of tasks/environments (as detailed above), but we then evaluate their adaptation ability on unseen environments at test time. |
| Hardware Specification | No | All experiments are conducted in a motion capture room. Computation is done on an external computer... The paper does not provide specific hardware details such as CPU, GPU models, or memory. |
| Software Dependencies | No | The paper mentions the "MuJoCo physics engine (Todorov et al., 2012)" but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | Appendix E, titled "HYPERPARAMETERS", provides tables (Table 3, 4, 5) listing specific values for learning rates, epochs, K, M, batch sizes, and other training parameters for different tasks. |