MAML and ANIL Provably Learn Representations
Authors: Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper pairs theory with numerical simulations: "We prove, for the first time, that both MAML and ANIL, as well as their first-order approximations, are capable of representation learning and recover the ground-truth subspace in this setting" and "In this section we run numerical simulations to verify our theoretical findings." Figure 1 visualizes this observation: all meta-learning approaches (Exact ANIL, MAML, and their first-order (FO) versions that ignore second-order derivatives) approach the ground truth exponentially fast, while a non-meta-learning baseline of average-loss minimization empirically fails to recover the ground truth. A hedged simulation sketch appears after this table. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, The University of Texas at Austin; (2) School of Computer Science and Engineering, University of Washington. |
| Pseudocode | No | The paper describes its algorithms (MAML and ANIL) through mathematical equations and textual descriptions, but there are no clearly labeled pseudocode blocks or algorithm figures. (A hedged ANIL sketch, reconstructed from the setup described in the paper, appears after this table.) |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper describes a "multi-task linear representation learning framework" where data is "sampled i.i.d. from a distribution P_{t,i}". For the numerical simulations, it states that "the ground-truth heads are sampled i.i.d. from N(0, diag([1, …, 1, µ²]))". This indicates a synthetic data-generation process within a theoretical framework rather than the use of a named, publicly available dataset (see the data-generation sketch after this table). |
| Dataset Splits | No | The paper describes the synthetic data generation process for its numerical simulations, but it does not specify any training, validation, or test dataset splits (e.g., percentages or sample counts) from a predefined dataset. |
| Hardware Specification | No | The paper does not specify any hardware used for running its experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper describes mathematical models and algorithms (MAML, ANIL) but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or any specific libraries). |
| Experiment Setup | Yes | In Section 6 ("Numerical simulations"), the paper states: "We set d = 100 and k = n = 5." and "We set d = 20, n = k = 3". It also specifies "step sizes β = α = 0.05 in all cases" for Figure 4 and "larger step sizes of α = β = 0.1" for Figure 1, and notes that "All results are averaged over 5 random trials." These quoted settings are echoed in the runner sketch after this table. |
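
The table rows above quote enough of the synthetic setting to sketch it. Below is a minimal data-generation sketch in Python, assuming the standard multi-task linear representation model y = ⟨B\*w\*_t, x⟩ with standard Gaussian inputs; the orthonormal B\*, the input distribution, and the noise model are assumptions on our part, since the paper excerpt only pins down the head distribution N(0, diag([1, …, 1, µ²])).

```python
import numpy as np

def make_ground_truth(d=100, k=5, mu=1.0, rng=None):
    """Sample a ground-truth subspace B* and a head sampler.

    Assumption: B* is a random d x k matrix with orthonormal columns.
    Heads are drawn i.i.d. from N(0, diag([1, ..., 1, mu^2])), as
    quoted in the table above; mu itself is left as a parameter.
    """
    rng = np.random.default_rng() if rng is None else rng
    B_star, _ = np.linalg.qr(rng.standard_normal((d, k)))
    scale = np.ones(k)
    scale[-1] = mu  # standard deviation of the last coordinate is mu
    def sample_head():
        return scale * rng.standard_normal(k)
    return B_star, sample_head

def sample_task_batch(B_star, w_star, n, noise_std=0.0, rng=None):
    """Draw n (x, y) pairs for one task.

    Assumption: inputs are standard Gaussian and labels are linear
    responses y = <B* w*, x>, plus optional Gaussian noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = rng.standard_normal((n, B_star.shape[0]))
    y = X @ (B_star @ w_star) + noise_std * rng.standard_normal(n)
    return X, y
```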
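
Since the paper provides no pseudocode, the following is a hedged sketch of one outer iteration of first-order ANIL in this linear setting: the inner loop takes a single gradient step on the head only, and the outer loop updates the representation at the adapted head without differentiating through the inner step. The split into inner/outer sample batches and the treatment of the initial head w0 in the outer update are our assumptions, not details taken from the paper.

```python
def fo_anil_step(B, w0, tasks, alpha=0.05, beta=0.05):
    """One outer iteration of first-order ANIL (a hedged sketch).

    `tasks` is a list of ((X_in, y_in), (X_out, y_out)) pairs: inner
    samples for head adaptation, outer samples for the meta-update.
    The model is f(x) = <B w, x> with squared loss; ANIL adapts only
    the head w in the inner loop, and the first-order variant ignores
    second-order derivatives in the outer step.
    """
    B_grad = np.zeros_like(B)
    w_grad = np.zeros_like(w0)
    for (X_in, y_in), (X_out, y_out) in tasks:
        # Inner loop: one gradient step on the head only (ANIL).
        r_in = X_in @ B @ w0 - y_in
        w_t = w0 - alpha * B.T @ (X_in.T @ r_in) / len(y_in)
        # Outer gradients evaluated at the adapted head w_t, treating
        # w_t as a constant (the first-order approximation).
        r_out = X_out @ B @ w_t - y_out
        B_grad += np.outer(X_out.T @ r_out, w_t) / len(y_out)
        w_grad += B.T @ (X_out.T @ r_out) / len(y_out)
    m = len(tasks)
    return B - beta * B_grad / m, w0 - beta * w_grad / m
```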
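
Finally, a trial loop echoing the quoted settings (d = 100, k = n = 5, α = β = 0.05, results averaged over 5 random trials) might look as follows; the number of tasks per outer step, the iteration budget, µ = 1, and the choice of subspace-distance metric are placeholders rather than values reported in the paper.

```python
def subspace_dist(B, B_star):
    """Sine of the largest principal angle between col(B) and col(B*)."""
    Q = np.linalg.qr(B)[0]
    Q_star = np.linalg.qr(B_star)[0]
    # Spectral norm of the projection of col(Q) onto col(B*)'s complement.
    return np.linalg.norm(Q - Q_star @ (Q_star.T @ Q), 2)

# Quoted settings: d = 100, k = n = 5, alpha = beta = 0.05, 5 trials.
d, k, n, alpha, beta = 100, 5, 5, 0.05, 0.05
for trial in range(5):
    rng = np.random.default_rng(trial)
    B_star, sample_head = make_ground_truth(d, k, mu=1.0, rng=rng)
    B = np.linalg.qr(rng.standard_normal((d, k)))[0]  # random init
    w0 = np.zeros(k)
    for _ in range(2000):  # iteration budget: a placeholder
        tasks = []
        for _ in range(10):  # tasks per outer step: a placeholder
            w_star = sample_head()
            tasks.append((sample_task_batch(B_star, w_star, n, rng=rng),
                          sample_task_batch(B_star, w_star, n, rng=rng)))
        B, w0 = fo_anil_step(B, w0, tasks, alpha, beta)
    print(f"trial {trial}: subspace distance = {subspace_dist(B, B_star):.4f}")
```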