On the Stability and Generalization of Meta-Learning

Authors: Yunjuan Wang, Raman Arora

NeurIPS 2024

Reproducibility assessment (each entry lists the variable, the assessed result, and the LLM response):
Research Type: Experimental. We focus on developing a theoretical understanding of meta-learning... We introduce a novel notion of stability for meta-learning algorithms... and give explicit generalization bounds... We also conduct a simple experiment to empirically verify our generalization bounds... We report the transfer risk, the average empirical risk (over tasks), and the generalization gap for different values of m and n in Figure 1.
Researcher Affiliation: Academia. Yunjuan Wang, Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 (ywang509@jhu.edu); Raman Arora, Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 (arora@cs.jhu.edu).
Pseudocode: Yes. Algorithm 1 (Prox Meta-Learning Algorithm A), Algorithm 2 (Task-specific Algorithm A_task), Algorithm 3 (Stochastic Prox Meta-Learning).
Open Source Code: Yes. The code is provided in the supplementary file (NeurIPS Paper Checklist, E. Open access to data and code).
Open Datasets: No. The paper uses a synthetic one-dimensional sine-wave regression problem in which the authors generate data from f(x; α, β) = α sin(x + β), with parameters sampled from uniform distributions (see the generation sketch after these entries). No concrete access information (link, DOI, repository, or citation to an external public dataset) is provided for the generated data itself.
Dataset Splits: No. The paper describes how training tasks (m tasks, each with n samples) and test tasks (new, unseen tasks with n samples) are generated. It also mentions an 'evaluation set of size 200' for test tasks. However, it does not explicitly state a separate validation split or how hyperparameters were tuned using such a split.
Hardware Specification: Yes. The experiment is conducted on a T4 GPU.
Software Dependencies: No. The paper does not provide specific software dependencies with version numbers; no libraries, frameworks, or solvers are mentioned with their corresponding versions.
Experiment Setup: Yes. We run Algorithm 1 for T = 100 iterations with a step size of γ = 0.1 and regularization parameter λ = 0.5. Algorithm 2 (GD) is run for K = 15 iterations with step size η = 0.02.
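
To make the Open Datasets and Dataset Splits entries concrete, the following is a minimal sketch of the synthetic sine-wave task generation, assuming NumPy. The ranges ALPHA_RANGE, BETA_RANGE, and X_RANGE and the task counts m and n are illustrative placeholders; the paper only states that α and β are sampled from uniform distributions.

    import numpy as np

    # Hypothetical ranges: the paper samples alpha and beta from uniform
    # distributions, but the exact ranges are not stated in this report.
    ALPHA_RANGE = (0.1, 5.0)   # amplitude range (assumption)
    BETA_RANGE = (0.0, np.pi)  # phase range (assumption)
    X_RANGE = (-5.0, 5.0)      # input range (assumption)

    def sample_task(rng):
        # One sine-wave regression task f(x; alpha, beta) = alpha * sin(x + beta).
        alpha = rng.uniform(*ALPHA_RANGE)
        beta = rng.uniform(*BETA_RANGE)
        return lambda x: alpha * np.sin(x + beta)

    def sample_dataset(task_fn, n, rng):
        # Draw n (x, y) pairs from a task.
        x = rng.uniform(*X_RANGE, size=(n, 1))
        return x, task_fn(x)

    rng = np.random.default_rng(0)
    m, n = 10, 20  # m training tasks with n samples each (placeholder values)
    train_tasks = [sample_dataset(sample_task(rng), n, rng) for _ in range(m)]
    # Test tasks and the 'evaluation set of size 200' can be drawn the same way,
    # e.g. sample_dataset(sample_task(rng), 200, rng) for a new, unseen task.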
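
For the Experiment Setup entry, a hedged sketch of the outer/inner loop structure with the reported hyperparameters (T = 100, γ = 0.1, λ = 0.5, K = 15, η = 0.02) is shown below. The proximally regularized task objective, the Moreau-envelope-style meta-gradient λ(u − w_i), and the linear model over [sin x, cos x] features are assumptions standing in for the paper's Algorithms 1 and 2, not the authors' exact implementation.

    import numpy as np

    def features(x):
        # Stand-in feature map [sin x, cos x]; the paper's actual model is not specified here.
        return np.hstack([np.sin(x), np.cos(x)])

    def task_gd(u, x, y, lam=0.5, eta=0.02, K=15):
        # Task-specific algorithm sketch (cf. Algorithm 2): K = 15 GD steps with step
        # size eta = 0.02 on the squared loss plus an assumed proximal term
        # (lam / 2) * ||w - u||^2 tying w to the meta-parameter u.
        w = u.copy()
        phi = features(x)
        for _ in range(K):
            grad = phi.T @ (phi @ w - y) / len(y) + lam * (w - u)
            w -= eta * grad
        return w

    def prox_meta_learning(tasks, T=100, gamma=0.1, lam=0.5):
        # Meta-learning loop sketch (cf. Algorithm 1): T = 100 outer iterations with
        # step size gamma = 0.1, moving u toward the task-specific solutions via the
        # assumed Moreau-envelope gradient lam * (u - w_i), averaged over tasks.
        u = np.zeros((2, 1))  # two features: [sin x, cos x]
        for _ in range(T):
            w_list = [task_gd(u, x, y, lam=lam) for (x, y) in tasks]
            u -= gamma * lam * np.mean([u - w for w in w_list], axis=0)
        return u

    # Usage with the sine-wave tasks from the previous sketch:
    #   u_hat = prox_meta_learning(train_tasks)

The feature map is chosen only because α sin(x + β) = α cos(β) sin(x) + α sin(β) cos(x) is exactly linear in [sin x, cos x], which keeps the sketch short; any other model could be substituted inside task_gd.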