On the Global Optimality of Model-Agnostic Meta-Learning
Authors: Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Despite its empirical success, MAML remains less understood in theory, especially in terms of its global optimality... To bridge such a gap between theory and practice, we characterize the optimality gap of the stationary points attained by MAML for both reinforcement learning and supervised learning... To the best of our knowledge, our analysis establishes the global optimality of MAML with nonconvex meta-objectives for the first time. |
| Researcher Affiliation | Academia | 1Department of Industrial Engineering and Management Sciences, Northwestern University, USA 2Department of Operations Research and Financial Engineering, Princeton University, USA. |
| Pseudocode | Yes | Algorithm 1 (Meta-RL). Require: sampled MDPs {(S, A, P_i, r_i, γ_i, ζ_i)}_{i∈[n]} from the task distribution, feature mapping φ, number of iterations T, learning rates {α_ℓ}_{ℓ∈[T]}, temperature parameter 1/τ, tuning parameter η, initial parameter θ_0. 1: Initialization. 2: for ℓ = 0, …, T−1 do 3: for i ∈ [n] do 4: Update the task-adapted policy: π_{i,θ_ℓ}(·\|s) ∝ exp{(1/τ)·[φ(s,·)^⊤ θ_ℓ + η·Q_i^{π_{θ_ℓ}}(s,·)]}. 5: Compute the auxiliary function h_{i,θ_ℓ}(s,a) via (3.8). 6: end for. 7: Compute the gradient of the meta-objective ∇_θ L(θ_ℓ) based on the policies {π_{i,θ_ℓ}}_{i∈[n]} and auxiliary functions {h_{i,θ_ℓ}}_{i∈[n]} via (3.7). 8: Update the parameter of the main effect: θ_{ℓ+1} ← θ_ℓ + α_ℓ · ∇_θ L(θ_ℓ). 9: Update the main effect: π_{θ_{ℓ+1}}(·\|s) ∝ exp{(1/τ)·φ(s,·)^⊤ θ_{ℓ+1}}. 10: end for. 11: Output: θ_T and π_{θ_T}. |
| Open Source Code | No | The paper is theoretical and focuses on analysis and proofs; there is no mention of open-sourcing code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe specific datasets used for training, nor does it provide access information for any publicly available or open datasets. |
| Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits (training, validation, test) needed for reproduction of experiments. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not provide specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include specific experimental setup details, hyperparameters, or training configurations. |
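The energy-based (softmax) policy used in steps 4 and 9 of Algorithm 1 can be sketched in a few lines. The function below is an illustrative assumption, not the authors' code (the paper releases none): `softmax_policy`, its argument names, and the NumPy setup are all hypothetical, and the per-task action values `q_s` are treated as given rather than estimated.

```python
import numpy as np

def softmax_policy(theta, phi_s, q_s=None, tau=1.0, eta=0.0):
    """Softmax policy over actions at a single state s.

    theta : (d,) policy parameter.
    phi_s : (n_actions, d) feature rows phi(s, a) for each action a.
    q_s   : optional (n_actions,) action values Q_i^{pi_theta}(s, .);
            with eta > 0 this mimics the task-adapted policy of step 4,
            and with eta = 0 it reduces to the main-effect policy of step 9.
    tau   : temperature (the paper's 1/tau scales the logits).
    """
    logits = phi_s @ theta
    if q_s is not None:
        logits = logits + eta * q_s
    logits = logits / tau
    logits -= logits.max()          # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()      # normalized action distribution

# Hypothetical usage with random features and zero Q-values.
rng = np.random.default_rng(0)
phi_s = rng.normal(size=(4, 3))     # 4 actions, feature dimension 3
theta = rng.normal(size=3)
pi_s = softmax_policy(theta, phi_s, q_s=np.zeros(4), tau=0.5, eta=0.1)
```

With `eta = 0.1` and zero `q_s` the adapted and main-effect policies coincide; in the algorithm the Q-term is what shifts the adapted policy toward each task before the meta-gradient step.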