Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL

Authors: Anusha Nagabandi, Chelsea Finn, Sergey Levine

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "we apply our meta-learning for online learning (MOLe) approach to model-based reinforcement learning, where adapting the predictive model is critical for control; we demonstrate that MOLe outperforms alternative prior methods, and enables effective continuous adaptation in non-stationary task distributions such as varying terrains, motor failures, and unexpected disturbances." |
| Researcher Affiliation | Academia | Anusha Nagabandi, Chelsea Finn & Sergey Levine, University of California, Berkeley ({nagaban2,cbfinn,svlevine}@berkeley.edu) |
| Pseudocode | Yes | Algorithm 1: Online Learning with Mixture of Meta-Trained Networks |
| Open Source Code | No | The paper provides a link for videos, not for open-source code: https://sites.google.com/berkeley.edu/onlineviameta |
| Open Datasets | No | The paper mentions using agents in the MuJoCo physics engine (Todorov et al., 2012) and training models on simulated data with varying conditions (e.g., "random slopes of low magnitudes", "random joints being crippled"), but it does not specify a pre-existing publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper mentions that MAML uses training and validation subsets (D^tr_T and D^val_T) internally during meta-training, where D^tr_T is of size k. However, it does not provide specific percentages, sample counts, or citations for overall training/validation/test splits of the experimental data. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned. |
| Software Dependencies | No | The paper mentions using the MuJoCo physics engine but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | "In all experiments, we use a dynamics model consisting of three hidden layers, each of dimension 500, with ReLU nonlinearities." Table 1 lists train-time hyperparameters; Table 2 lists run-time hyperparameters. |
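The one architectural detail the paper does give (a dynamics model with three hidden layers of width 500 and ReLU nonlinearities) can be sketched as a plain numpy forward pass. The input/output conventions here — concatenated state-action input and a predicted state delta — are common in model-based RL but are assumptions for illustration, as are the function names and the initialization scheme:

```python
import numpy as np

def init_dynamics_model(state_dim, action_dim, hidden_dim=500, seed=0):
    """Initialize weights for a 3-hidden-layer ReLU MLP (He-style init assumed)."""
    rng = np.random.default_rng(seed)
    dims = [state_dim + action_dim, hidden_dim, hidden_dim, hidden_dim, state_dim]
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def predict_next_state(params, state, action):
    """Forward pass: three ReLU hidden layers, linear output head.

    Predicting a delta and adding it to the current state is an assumed
    convention, not confirmed by the excerpt above.
    """
    x = np.concatenate([state, action], axis=-1)
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)  # ReLU hidden layer
    W, b = params[-1]
    delta = x @ W + b                   # linear output layer
    return state + delta
```

In MOLe-style online adaptation, the parameters of such a model would be updated continually from recent experience; this sketch covers only the network itself.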