Understanding Benign Overfitting in Gradient-Based Meta Learning
Authors: Lisha Chen, Songtao Lu, Tianyi Chen
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | While our analysis uses relatively tractable linear models, our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation, and benign overfitting in gradient-based meta learning tasks. We corroborate our theoretical claims through numerical simulations. |
| Researcher Affiliation | Collaboration | Lisha Chen (Rensselaer Polytechnic Institute, Troy, NY, USA; chenl21@rpi.edu); Songtao Lu (IBM Research, Yorktown Heights, NY, USA; songtao@ibm.com); Tianyi Chen (Rensselaer Polytechnic Institute, Troy, NY, USA; chentianyi19@gmail.com) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A] Work of theoretical nature. |
| Open Datasets | No | The paper discusses 'numerical simulations' based on a 'meta linear regression model' with 'Assumptions 2-4' about data properties, but does not specify a publicly available dataset used for these simulations. |
| Dataset Splits | Yes | For each task $m$, we observe $N$ samples with input feature $x_m \in \mathcal{X}_m \subseteq \mathbb{R}^d$ and target label $y_m \in \mathcal{Y}_m \subseteq \mathbb{R}$ drawn i.i.d. from a task-specific data distribution $P_m$. These samples are collected in the dataset $D_m = \{(x_{m,n}, y_{m,n})\}_{n=1}^{N}$, which is divided into the train and validation datasets, denoted as $D_m^{\mathrm{tr}}$ and $D_m^{\mathrm{va}}$, with $|D_m^{\mathrm{tr}}| = N_{\mathrm{tr}}$, $|D_m^{\mathrm{va}}| = N_{\mathrm{va}}$, and $N = N_{\mathrm{tr}} + N_{\mathrm{va}}$. (A data-generation sketch of this setup follows the table.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or memory) used for running the numerical simulations. |
| Software Dependencies | No | The paper does not provide any specific software dependencies with version numbers used for the numerical simulations. |
| Experiment Setup | Yes | Figure 3: Excess risk vs. the number of samples ($N$) with different hyperparameters ($M = 10$, $d = 200$). Example 1 (Data covariance): Suppose $Q_m = \mathrm{diag}(I_{d_1}, \beta I_{d - d_1})$ for all $m$. Set $M = 10$, $d = 200$, $d_1 = 20$, $\alpha = 0.1$ for MAML and $\gamma = 10^3$ for iMAML. (An adaptation-step sketch with these hyperparameters follows the table.) |
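
To make the Dataset Splits row concrete, here is a minimal sketch, assuming a synthetic meta linear regression setup in NumPy: each task draws $N$ i.i.d. samples with the diagonal feature covariance $Q_m = \mathrm{diag}(I_{d_1}, \beta I_{d - d_1})$ and a noisy linear target, then splits them into $D_m^{\mathrm{tr}}$ and $D_m^{\mathrm{va}}$. The Gaussian task vectors, the noise scale, and the specific values of $N_{\mathrm{tr}}$, $N_{\mathrm{va}}$, and $\beta$ are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch (not the authors' code): per-task data generation and
# train/validation split for a meta linear regression model, following the
# Dataset Splits description above. Task vectors, noise scale, and split sizes
# are assumptions made for illustration.
import numpy as np

rng = np.random.default_rng(0)

M, d = 10, 200          # number of tasks and feature dimension (from Example 1 in the table)
N_tr, N_va = 20, 20     # per-task train/validation sizes; N = N_tr + N_va (sizes assumed)
N = N_tr + N_va

d1, beta = 20, 0.5      # covariance split d1 (from Example 1) and tail scale beta (value assumed)
cov_diag = np.concatenate([np.ones(d1), beta * np.ones(d - d1)])  # Q_m = diag(I_{d1}, beta I_{d-d1})

def make_task(rng, d, N, cov_diag):
    """Draw N i.i.d. samples (x, y) for one task with y = <w, x> + noise."""
    w = rng.normal(size=d)                            # task-specific regression vector (assumed Gaussian)
    X = rng.normal(size=(N, d)) * np.sqrt(cov_diag)   # features with diagonal covariance Q_m
    y = X @ w + 0.1 * rng.normal(size=N)              # noisy linear targets (noise scale assumed)
    return X, y

tasks = []
for m in range(M):
    X, y = make_task(rng, d, N, cov_diag)
    # Split D_m into D_m^tr and D_m^va with |D_m^tr| = N_tr and |D_m^va| = N_va.
    tasks.append({
        "X_tr": X[:N_tr], "y_tr": y[:N_tr],
        "X_va": X[N_tr:], "y_va": y[N_tr:],
    })
```

The split here is a simple first-$N_{\mathrm{tr}}$/last-$N_{\mathrm{va}}$ partition; since the samples are drawn i.i.d. within each task, any fixed partition of the same sizes would serve the same purpose.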
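
The Experiment Setup row reports $\alpha = 0.1$ as the MAML step size. The sketch below shows, under stated assumptions, how one MAML-style inner adaptation step on the train split followed by evaluation on the validation split could look; the meta-parameter $w_0$, the squared-error loss, and the toy placeholder tasks are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): one MAML-style inner gradient
# step per task on D_m^tr, then the validation loss on D_m^va that would drive
# the outer update, using alpha = 0.1 as in Example 1. The loss form and the
# placeholder tasks are assumptions for illustration.
import numpy as np

def mse(w, X, y):
    """Mean-squared error of the linear predictor w on (X, y)."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def mse_grad(w, X, y):
    """Gradient of the mean-squared error with respect to w."""
    return X.T @ (X @ w - y) / X.shape[0]

def maml_outer_loss(w0, tasks, alpha=0.1):
    """Average validation loss after one inner gradient step per task."""
    losses = []
    for t in tasks:
        w_adapted = w0 - alpha * mse_grad(w0, t["X_tr"], t["y_tr"])  # inner step on D_m^tr
        losses.append(mse(w_adapted, t["X_va"], t["y_va"]))          # evaluate on D_m^va
    return np.mean(losses)

# Tiny usage example with random placeholder tasks (shapes only, not the paper's data).
rng = np.random.default_rng(1)
toy_tasks = [{"X_tr": rng.normal(size=(20, 200)), "y_tr": rng.normal(size=20),
              "X_va": rng.normal(size=(20, 200)), "y_va": rng.normal(size=20)}
             for _ in range(10)]
print(maml_outer_loss(np.zeros(200), toy_tasks, alpha=0.1))
```

iMAML replaces the explicit inner gradient step with an implicitly regularized inner problem governed by $\gamma$; that variant is not sketched here.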