Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

Authors: Chelsea Finn, Pieter Abbeel, Sergey Levine

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The goal of our experimental evaluation is to answer the following questions: (1) Can MAML enable fast learning of new tasks? (2) Can MAML be used for meta-learning in multiple different domains, including supervised regression, classification, and reinforcement learning? (3) Can a model learned with MAML continue to improve with additional gradient updates and/or examples? All of the experiments were performed using TensorFlow (Abadi et al., 2016), which allows for automatic differentiation through the gradient update(s) during meta-learning.
Researcher Affiliation | Collaboration | University of California, Berkeley; OpenAI. Correspondence to: Chelsea Finn <cbfinn@eecs.berkeley.edu>.
Pseudocode | Yes | Algorithm 1: Model-Agnostic Meta-Learning
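The two-level structure of Algorithm 1 (an inner gradient step per task, then a meta-update through that step) can be sketched on a toy problem. This is an illustrative sketch, not the authors' implementation: each task i has the scalar loss L_i(θ) = (θ − t_i)², chosen so the inner gradient and the second-order meta-gradient are exact analytic expressions; the names `alpha` and `beta` follow the step sizes in Algorithm 1, while `maml_step` and the task distribution are assumptions for the example.

```python
import random

def maml_step(theta, tasks, alpha=0.01, beta=0.1):
    """One outer (meta) update of MAML over a batch of toy tasks.

    Each task is a scalar target t with loss L(theta) = (theta - t)**2,
    so all derivatives below are computed analytically.
    """
    meta_grad = 0.0
    for t in tasks:
        # Inner loop: one gradient step on this task's loss.
        grad_inner = 2.0 * (theta - t)            # dL/dtheta
        theta_prime = theta - alpha * grad_inner  # adapted parameters
        # Outer loop: gradient of L(theta_prime) w.r.t. the ORIGINAL theta.
        # d(theta_prime)/d(theta) = 1 - 2*alpha is the second-order term
        # that full MAML differentiates through.
        meta_grad += 2.0 * (theta_prime - t) * (1.0 - 2.0 * alpha)
    return theta - beta * meta_grad / len(tasks)

random.seed(0)
theta = 0.0
for _ in range(200):
    tasks = [random.uniform(-1.0, 1.0) for _ in range(4)]  # sample task batch
    theta = maml_step(theta, tasks)
# theta settles near the task-distribution mean (0 here): the point from
# which one inner gradient step adapts best on average.
```

In a real implementation the analytic derivatives are replaced by automatic differentiation through the update, which is why the paper highlights TensorFlow's ability to differentiate through the inner gradient step(s).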
Open Source Code | Yes | The code is available online: code for the regression and supervised experiments is at github.com/cbfinn/maml, and code for the RL experiments is at github.com/cbfinn/maml_rl.
Open Datasets | Yes | To evaluate MAML in comparison to prior meta-learning and few-shot learning algorithms, we applied our method to few-shot image recognition on the Omniglot (Lake et al., 2011) and MiniImagenet datasets. ... we constructed several sets of tasks based off of the simulated continuous control environments in the rllab benchmark suite (Duan et al., 2016a).
Dataset Splits | Yes | The MiniImagenet dataset was proposed by Ravi & Larochelle (2017), and involves 64 training classes, 12 validation classes, and 24 test classes. ... For Omniglot, we randomly select 1200 characters for training, irrespective of alphabet, and use the remaining for testing.
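The Omniglot split described above (1200 randomly chosen character classes for training, the remainder for testing, ignoring alphabet boundaries) is simple enough to sketch. The class count of 1623 is Omniglot's published total number of characters; the function name and seed are illustrative assumptions, not from the paper.

```python
import random

def split_omniglot_classes(num_classes=1623, num_train=1200, seed=0):
    """Randomly split character class indices into train/test sets,
    irrespective of alphabet, as described for the Omniglot experiments."""
    rng = random.Random(seed)            # fixed seed for a reproducible split
    classes = list(range(num_classes))
    rng.shuffle(classes)
    return classes[:num_train], classes[num_train:]

train_cls, test_cls = split_omniglot_classes()
# 1200 training classes, 423 held-out test classes, no overlap.
```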
Hardware Specification | No | The paper mentions that experiments were performed using TensorFlow, but does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | All of the experiments were performed using TensorFlow (Abadi et al., 2016), which allows for automatic differentiation through the gradient update(s) during meta-learning. ... When training with MAML, we use one gradient update with K = 10 examples with a fixed step size α = 0.01, and use Adam as the meta-optimizer (Kingma & Ba, 2015). ... we use vanilla policy gradient (REINFORCE) (Williams, 1992), and we use trust-region policy optimization (TRPO) as the meta-optimizer (Schulman et al., 2015).
Experiment Setup | Yes | When training with MAML, we use one gradient update with K = 10 examples with a fixed step size α = 0.01, and use Adam as the meta-optimizer (Kingma & Ba, 2015). The regressor is a neural network model with 2 hidden layers of size 40 with ReLU nonlinearities. ... Our model follows the same architecture as the embedding function used by Vinyals et al. (2016), which has 4 modules with 3 × 3 convolutions and 64 filters, followed by batch normalization (Ioffe & Szegedy, 2015), a ReLU nonlinearity, and 2 × 2 max-pooling. ... The gradient updates are computed using vanilla policy gradient (REINFORCE) (Williams, 1992), and we use trust-region policy optimization (TRPO) as the meta-optimizer (Schulman et al., 2015). The horizon is H = 200, with 20 rollouts per gradient step for all problems except the ant forward/backward task, which used 40 rollouts per step.
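The regression network described above (2 hidden layers of size 40 with ReLU nonlinearities, scalar input and output) can be sketched as a forward pass. This is a minimal sketch, not the authors' code: the He-style weight initialization and the function names `init_params`/`forward` are assumptions for the example; the batch of 10 inputs mirrors the K = 10 examples used per inner update.

```python
import numpy as np

def init_params(rng, sizes=(1, 40, 40, 1)):
    """Weights/biases for a fully connected net: 1 -> 40 -> 40 -> 1.

    He-style scaling is an illustrative choice, not specified in the paper.
    """
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass: ReLU on the hidden layers, linear output layer."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:     # no nonlinearity on the output layer
            h = np.maximum(h, 0.0)  # ReLU
    return h

rng = np.random.default_rng(0)
params = init_params(rng)
x = np.linspace(-5.0, 5.0, 10).reshape(-1, 1)  # K = 10 sample inputs
y_hat = forward(params, x)                     # predictions, shape (10, 1)
```

During MAML training, the inner update would take one gradient step on these parameters with step size α = 0.01 before the meta-loss is evaluated.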