MAML and ANIL Provably Learn Representations

Authors: Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper combines theory with experiments: "We prove, for the first time, that both MAML and ANIL, as well as their first-order approximations, are capable of representation learning and recover the ground-truth subspace in this setting." and "In this section we run numerical simulations to verify our theoretical findings." Figure 1 visualizes this observation: all meta-learning approaches (Exact ANIL, MAML, and their first-order (FO) versions that ignore second-order derivatives) approach the ground truth exponentially fast, while a non-meta-learning baseline of average loss minimization empirically fails to recover the ground truth.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, The University of Texas at Austin; (2) School of Computer Science and Engineering, University of Washington.
Pseudocode | No | The paper describes the MAML and ANIL algorithms using mathematical equations and textual descriptions, but there are no clearly labeled pseudocode blocks or algorithm figures.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository.
Open Datasets | No | The paper describes a "multi-task linear representation learning framework" where data is "sampled i.i.d. from a distribution Pt,i". For the numerical simulations, it states that "the ground-truth heads are sampled i.i.d. from N(0, diag([1, . . . , 1, µ2]))". This indicates a synthetic data-generation process within a theoretical framework rather than the use of a named, publicly available dataset.
Dataset Splits | No | The paper describes the synthetic data-generation process for its numerical simulations, but it does not specify any training, validation, or test splits (e.g., percentages or sample counts) from a predefined dataset.
Hardware Specification | No | The paper does not specify any hardware used to run its experiments, such as GPU or CPU models, or cloud computing specifications.
Software Dependencies | No | The paper describes mathematical models and algorithms (MAML, ANIL) but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or any specific libraries).
Experiment Setup | Yes | In Section 6, "Numerical simulations", the paper states: "We set d = 100 and k = n = 5." and "We set d = 20, n = k = 3". It also specifies "step sizes β = α = 0.05 in all cases for Figure 4" and "larger step sizes of α = β = 0.1" for Figure 1. It further mentions: "All results are averaged over 5 random trials."
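The subspace-recovery behavior summarized in the Research Type row can be illustrated with a minimal sketch of first-order ANIL in the multi-task linear representation setting. This is an assumption-laden sketch, not the paper's code: it uses population (infinite-sample) gradients, noiseless labels, and illustrative sizes (d = 20, k = 3, T = 50 tasks); only the step sizes α = β = 0.1 are taken from the paper's Figure 1 setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, T = 20, 3, 50      # ambient dim, representation dim, tasks (illustrative)
alpha, beta = 0.1, 0.1   # "larger step sizes of alpha = beta = 0.1" (Figure 1)

# Ground truth: a k-dimensional subspace of R^d and one linear head per task
B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]
W_star = rng.standard_normal((k, T))

B = np.linalg.qr(rng.standard_normal((d, k)))[0]  # learned representation
w0 = np.zeros(k)                                  # shared head initialization

def subspace_dist(B, B_star):
    # || (I - proj_B) B_star ||_F: zero iff col(B_star) lies in col(B)
    Q = np.linalg.qr(B)[0]
    return np.linalg.norm(B_star - Q @ (Q.T @ B_star))

init_dist = subspace_dist(B, B_star)
for _ in range(3000):
    # Inner step (ANIL), vectorized over tasks: adapt only the heads,
    # keeping the representation B frozen (population gradient of the
    # per-task squared loss with standard Gaussian inputs)
    R0 = B @ w0[:, None] - B_star @ W_star   # pre-adaptation residuals (d x T)
    W_t = w0[:, None] - alpha * B.T @ R0     # adapted heads (k x T)
    # First-order outer update of B, evaluated at the adapted heads
    G = (B @ W_t - B_star @ W_star) @ W_t.T / T
    B -= beta * G
final_dist = subspace_dist(B, B_star)
```

After training, `final_dist` is far below `init_dist`, mirroring the paper's claim that (FO-)ANIL drives the learned representation toward the ground-truth subspace exponentially fast.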
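The synthetic data-generation process quoted in the Open Datasets row can be sketched as follows. The dimensions d = 20, n = k = 3 come from the paper's setup; the number of tasks T, the value of µ, and the random seed are illustrative assumptions, since the paper releases no code.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n, T = 20, 3, 3, 10  # "We set d = 20, n = k = 3"; T is illustrative
mu = 0.1                   # scale of the last head coordinate (assumed value)

# Ground-truth low-dimensional representation: a random k-dim subspace of R^d
B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]

# Ground-truth heads sampled i.i.d. from N(0, diag([1, ..., 1, mu^2]))
scale = np.ones(k)
scale[-1] = mu
W_star = rng.standard_normal((T, k)) * scale

# Per-task data: n i.i.d. standard Gaussian samples with noiseless labels
X = rng.standard_normal((T, n, d))
y = np.einsum('tnd,dk,tk->tn', X, B_star, W_star)  # y[t, i] = <x_{t,i}, B* w*_t>
```

Shrinking µ makes the last coordinate of every head nearly zero, so the effective head diversity drops; the paper uses this knob to study how task diversity affects recovery.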
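The Experiment Setup row's quoted parameters (d = 100, k = n = 5, averaging over 5 random trials) can be mirrored in a small harness. The recovery metric and seeds here are assumptions, since the paper does not publish code; the metric below is a standard subspace distance measuring how far a candidate representation is from containing the ground-truth subspace.

```python
import numpy as np

def subspace_dist(B, B_star):
    """|| (I - proj_B) B_star ||_F: zero iff col(B_star) lies in col(B)."""
    Q = np.linalg.qr(B)[0]
    return np.linalg.norm(B_star - Q @ (Q.T @ B_star))

d, k = 100, 5  # "We set d = 100 and k = n = 5."
rng = np.random.default_rng(0)

# "All results are averaged over 5 random trials."
dists = []
for _ in range(5):
    B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]
    B_init = np.linalg.qr(rng.standard_normal((d, k)))[0]
    dists.append(subspace_dist(B_init, B_star))
avg_init_dist = float(np.mean(dists))  # distance before any training
```

For a random initialization this distance concentrates near sqrt(k(d − k)/d) ≈ 2.18 at d = 100, k = 5; a successful run of any of the meta-learning methods would drive it toward zero.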