On the Global Optimality of Model-Agnostic Meta-Learning

Authors: Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "Despite its empirical success, MAML remains less understood in theory, especially in terms of its global optimality... To bridge such a gap between theory and practice, we characterize the optimality gap of the stationary points attained by MAML for both reinforcement learning and supervised learning... To the best of our knowledge, our analysis establishes the global optimality of MAML with nonconvex meta-objectives for the first time."
Researcher Affiliation | Academia | ¹Department of Industrial Engineering and Management Sciences, Northwestern University, USA; ²Department of Operations Research and Financial Engineering, Princeton University, USA.
Pseudocode | Yes | Algorithm 1 (Meta-RL):
Require: Sampled MDPs {(S, A, P_i, r_i, γ_i, ζ_i)}_{i∈[n]} from the task distribution, feature mapping φ, number of iterations T, learning rates {α_ℓ}_{ℓ∈[T]}, temperature parameter τ, tuning parameter η, initial parameter θ_0.
1: Initialization.
2: for ℓ = 0, ..., T − 1 do
3:   for i ∈ [n] do
4:     Update the adapted policy: π_{i,θ_ℓ}(·|s) ∝ exp{(1/τ)·φ(s,·)ᵀθ_ℓ + η·Q_i^{π_{θ_ℓ}}(s,·)}.
5:     Compute the auxiliary function h_{i,θ_ℓ}(s,a) via (3.8).
6:   end for
7:   Compute the gradient of the meta-objective ∇_θ L(θ_ℓ) based on the policies {π_{i,θ_ℓ}}_{i∈[n]} and auxiliary functions {h_{i,θ_ℓ}}_{i∈[n]} via (3.7).
8:   Update the parameter of the main effect: θ_{ℓ+1} ← θ_ℓ + α_ℓ·∇_θ L(θ_ℓ).
9:   Update the main effect: π_{θ_{ℓ+1}}(·|s) ∝ exp{(1/τ)·φ(s,·)ᵀθ_{ℓ+1}}.
10: end for
11: Output: θ_T and π_{θ_T}.
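The policy and parameter updates in Algorithm 1 can be sketched in NumPy. This is an illustrative sketch only: the paper is theoretical and releases no code, so the helper names (`softmax_policy`, `meta_step`) and the exact placement of the temperature τ and tuning parameter η in the exponent are assumptions inferred from the pseudocode.

```python
import numpy as np

def softmax_policy(theta, phi_s, q_s=None, tau=1.0, eta=0.0):
    """Energy-based softmax policy over actions at a single state s.

    With q_s=None this sketches the main effect (line 9):
        pi_theta(a|s) ∝ exp{(1/tau) * phi(s,a)^T theta}.
    With q_s given it sketches the task-adapted policy (line 4):
        pi_{i,theta}(a|s) ∝ exp{(1/tau) * phi(s,a)^T theta + eta * Q_i(s,a)}.

    phi_s: (n_actions, d) feature matrix; q_s: (n_actions,) action values.
    """
    energy = phi_s @ theta / tau
    if q_s is not None:
        energy = energy + eta * q_s
    energy = energy - energy.max()   # subtract max for numerical stability
    p = np.exp(energy)
    return p / p.sum()

def meta_step(theta, grad_meta, alpha):
    """Gradient-ascent update of the main-effect parameter (line 8):
    theta_{l+1} = theta_l + alpha_l * grad L(theta_l)."""
    return theta + alpha * grad_meta
```

With zero parameters and identity features, the adapted policy simply reweights actions by exp(η·Q), which makes the role of the one-step adaptation term visible.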
Open Source Code | No | The paper is theoretical and focuses on analysis and proofs; there is no mention of open-sourcing code for the described methodology.
Open Datasets | No | The paper is theoretical and does not describe specific datasets used for training, nor does it provide access information for any publicly available or open datasets.
Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits (training, validation, test) needed for reproduction of experiments.
Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments.
Software Dependencies | No | The paper is theoretical and does not provide specific software dependencies or version numbers.
Experiment Setup | No | The paper is theoretical and does not include specific experimental setup details, hyperparameters, or training configurations.