Generalization of Model-Agnostic Meta-Learning Algorithms: Recurring and Unseen Tasks
Authors: Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study the generalization properties of Model-Agnostic Meta-Learning (MAML) algorithms for supervised learning problems. We focus on the setting in which we train the MAML model over $m$ tasks, each with $n$ data points, and characterize its generalization error from two points of view: First, we assume the new task at test time is one of the training tasks, and we show that, for strongly convex objective functions, the expected excess population loss is bounded by $\mathcal{O}(1/mn)$. Second, we consider the MAML algorithm's generalization to an unseen task and show that the resulting generalization error depends on the total variation distance between the underlying distributions of the new task and the tasks observed during the training process. Our proof techniques rely on the connections between algorithmic stability and generalization bounds of algorithms. In particular, we propose a new definition of stability for meta-learning algorithms, which allows us to capture the role of both the number of tasks $m$ and the number of samples per task $n$ on the generalization error of MAML. (A schematic restatement of these two bounds appears after the table.) |
| Researcher Affiliation | Academia | Alireza Fallah, EECS Department, Massachusetts Institute of Technology (afallah@mit.edu); Aryan Mokhtari, ECE Department, The University of Texas at Austin (mokhtari@austin.utexas.edu); Asuman Ozdaglar, EECS Department, Massachusetts Institute of Technology (asuman@mit.edu) |
| Pseudocode | Yes | Algorithm 1: MAML [1]. Input: the set of datasets $S = \{S_i\}_{i=1}^m$ with $S_i = \{S_i^{\mathrm{in}}, S_i^{\mathrm{out}}\}$; test-time batch size $K$; number of tasks sampled at each round $r$; number of iterations $T$. Choose an arbitrary initial point $w^0 \in \mathcal{W}$. For $t = 0$ to $T-1$: choose $r$ tasks uniformly at random (out of $m$ tasks) and store their indices in $\mathcal{B}_t$; for all $\mathcal{T}_i$ with $i \in \mathcal{B}_t$: sample a batch $\mathcal{D}_i^{t,\mathrm{in}}$ of $K$ different elements from $S_i^{\mathrm{in}}$ with replacement; sample a batch $\mathcal{D}_i^{t,\mathrm{out}}$ of size $b$ from $S_i^{\mathrm{out}}$ with replacement; set $w_i^{t+1} := w^t - \beta_t \left(I_d - \alpha \nabla^2 \hat{L}(w^t, \mathcal{D}_i^{t,\mathrm{in}})\right) \nabla \hat{L}\big(w^t - \alpha \nabla \hat{L}(w^t, \mathcal{D}_i^{t,\mathrm{in}}),\, \mathcal{D}_i^{t,\mathrm{out}}\big)$. Then set $w^{t+1} := \Pi_{\mathcal{W}}\big(\frac{1}{r} \sum_{i \in \mathcal{B}_t} w_i^{t+1}\big)$. Return $w^T$ and $\bar{w}^T := \frac{1}{T+1} \sum_{t=0}^{T} w^t$. (A hedged Python sketch of this update appears after the table.) |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not mention using specific publicly available datasets for training or empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not mention any hardware specifications used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not provide specific ancillary software details with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details such as hyperparameter values or training configurations. |
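The two regimes quoted in the abstract can be summarized schematically. The display below is a paraphrase, not the paper's exact theorem statements: the constant $c$, the averaged training distribution, and the additive form of the unseen-task bound are illustrative assumptions; only the $\mathcal{O}(1/mn)$ rate for recurring tasks and the dependence on total variation distance for unseen tasks come from the abstract.

```latex
% Recurring task (one of the m training tasks), strongly convex losses:
% the expected excess population loss decays at rate 1/(mn).
\mathbb{E}\big[L_{\mu_i}(w_{\mathrm{MAML}})\big] - \min_{w \in \mathcal{W}} L_{\mu_i}(w)
  \;\le\; \mathcal{O}\!\left(\frac{1}{mn}\right)

% Unseen task with distribution \mu_{new} (schematic; the additive form and the
% constant c are assumptions -- the paper only states the dependence on TV distance):
\mathrm{gen.\ error}(\mu_{\mathrm{new}})
  \;\lesssim\; \mathcal{O}\!\left(\frac{1}{mn}\right)
  + c \cdot \mathrm{TV}\!\Big(\mu_{\mathrm{new}},\, \tfrac{1}{m}\textstyle\sum_{i=1}^{m}\mu_i\Big)
```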
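Since Algorithm 1 is the only algorithmic content extracted from the paper, a runnable sketch may help make the update concrete. The following is a minimal NumPy implementation under stated assumptions: it fixes a quadratic (least-squares) loss so the gradient and Hessian are available in closed form, uses a constant step size `beta` in place of the schedule $\beta_t$, takes $\mathcal{W}$ to be a Euclidean ball for the projection $\Pi_{\mathcal{W}}$, and samples the $r$ tasks without replacement (the extracted pseudocode does not say either way). The names `maml`, `grad`, and `hess` are hypothetical; none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(w, X, y):
    """Gradient of the batch-averaged squared loss 0.5*||Xw - y||^2 (assumed loss)."""
    return X.T @ (X @ w - y) / len(y)

def hess(w, X, y):
    """Hessian of the same loss (constant in w for least squares)."""
    return X.T @ X / len(y)

def maml(tasks, w0, alpha, beta, T, r, K, b, radius=10.0):
    """Sketch of Algorithm 1. `tasks[i]` is ((X_in, y_in), (X_out, y_out));
    `radius` defines the assumed projection set W = {w : ||w||_2 <= radius}."""
    w, iterates, m = w0.copy(), [w0.copy()], len(tasks)
    for t in range(T):
        B_t = rng.choice(m, size=r, replace=False)      # indices of the r sampled tasks
        updates = []
        for i in B_t:
            (X_in, y_in), (X_out, y_out) = tasks[i]
            idx_in = rng.integers(len(y_in), size=K)    # D_i^{t,in}, drawn with replacement
            idx_out = rng.integers(len(y_out), size=b)  # D_i^{t,out}, drawn with replacement
            w_adapted = w - alpha * grad(w, X_in[idx_in], y_in[idx_in])  # inner adaptation
            correction = np.eye(len(w)) - alpha * hess(w, X_in[idx_in], y_in[idx_in])
            g_out = grad(w_adapted, X_out[idx_out], y_out[idx_out])
            updates.append(w - beta * correction @ g_out)
        w = np.mean(updates, axis=0)
        norm = np.linalg.norm(w)                        # projection onto the ball W
        if norm > radius:
            w = w * (radius / norm)
        iterates.append(w.copy())
    return w, np.mean(iterates, axis=0)                 # w^T and averaged iterate \bar{w}^T

# Toy usage: m linear-regression tasks sharing dimension d but with different optima.
d, m = 5, 20
tasks = []
for _ in range(m):
    w_star = rng.normal(size=d)
    X1, X2 = rng.normal(size=(30, d)), rng.normal(size=(30, d))
    tasks.append(((X1, X1 @ w_star), (X2, X2 @ w_star)))
w_T, w_bar = maml(tasks, np.zeros(d), alpha=0.1, beta=0.1, T=100, r=5, K=10, b=10)
```

Returning both the last iterate and the averaged iterate mirrors the algorithm's two outputs $w^T$ and $\bar{w}^T$; the paper's stability analysis is stated for the strongly convex regime that the least-squares loss here satisfies when the design matrices have full column rank.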