On the Global Optimality of Model-Agnostic Meta-Learning
Authors: Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Despite its empirical success, MAML remains less understood in theory, especially in terms of its global optimality... To bridge such a gap between theory and practice, we characterize the optimality gap of the stationary points attained by MAML for both reinforcement learning and supervised learning... To the best of our knowledge, our analysis establishes the global optimality of MAML with nonconvex meta-objectives for the first time. |
| Researcher Affiliation | Academia | 1Department of Industrial Engineering and Management Sciences, Northwestern University, USA 2Department of Operations Research and Financial Engineering, Princeton University, USA. |
| Pseudocode | Yes | Algorithm 1 (Meta-RL). Require: sampled MDPs {(S, A, P_i, r_i, γ_i, ζ_i)}_{i∈[n]} from the task distribution, feature mapping φ, number of iterations T, learning rates {α_ℓ}_{ℓ∈[T]}, temperature parameter 1/τ, tuning parameter η, initial parameter θ_0. 1: Initialization. 2: for ℓ = 0, …, T−1 do 3: for i ∈ [n] do 4: Update the task-adapted policy: π_{i,θ_ℓ}(·\|s) ∝ exp{(1/τ)·[φ(s,·)^⊤ θ_ℓ + η·Q_i^{π_{θ_ℓ}}(s,·)]}. 5: Compute the auxiliary function h_{i,θ_ℓ}(s,a) via (3.8). 6: end for. 7: Compute the gradient of the meta-objective ∇_θ L(θ_ℓ) based on the policies {π_{i,θ_ℓ}}_{i∈[n]} and auxiliary functions {h_{i,θ_ℓ}}_{i∈[n]} via (3.7). 8: Update the parameter of the main effect: θ_{ℓ+1} ← θ_ℓ + α_ℓ · ∇_θ L(θ_ℓ). 9: Update the main effect: π_{θ_{ℓ+1}}(·\|s) ∝ exp{(1/τ)·φ(s,·)^⊤ θ_{ℓ+1}}. 10: end for. 11: Output: θ_T and π_{θ_T}. |
| Open Source Code | No | The paper is theoretical and focuses on analysis and proofs; there is no mention of open-sourcing code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe specific datasets used for training, nor does it provide access information for any publicly available or open datasets. |
| Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits (training, validation, test) needed for reproduction of experiments. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not provide specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include specific experimental setup details, hyperparameters, or training configurations. |
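The energy-based (softmax) policy used in steps 4 and 9 of Algorithm 1 can be sketched in a few lines. The function below is an illustrative assumption, not the authors' code (the paper releases none): `softmax_policy`, its argument names, and the NumPy setup are all hypothetical, and the per-task action values `q_s` are treated as given rather than estimated.

```python
import numpy as np

def softmax_policy(theta, phi_s, q_s=None, tau=1.0, eta=0.0):
    """Softmax policy over actions at a single state s.

    theta : (d,) policy parameter.
    phi_s : (n_actions, d) feature rows phi(s, a) for each action a.
    q_s   : optional (n_actions,) action values Q_i^{pi_theta}(s, .);
            with eta > 0 this mimics the task-adapted policy of step 4,
            and with eta = 0 it reduces to the main-effect policy of step 9.
    tau   : temperature (the paper's 1/tau scales the logits).
    """
    logits = phi_s @ theta
    if q_s is not None:
        logits = logits + eta * q_s
    logits = logits / tau
    logits -= logits.max()          # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()      # normalized action distribution

# Hypothetical usage with random features and zero Q-values.
rng = np.random.default_rng(0)
phi_s = rng.normal(size=(4, 3))     # 4 actions, feature dimension 3
theta = rng.normal(size=3)
pi_s = softmax_policy(theta, phi_s, q_s=np.zeros(4), tau=0.5, eta=0.1)
```

With `eta = 0.1` and zero `q_s` the adapted and main-effect policies coincide; in the algorithm the Q-term is what shifts the adapted policy toward each task before the meta-gradient step.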