Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning
Authors: Peizhong Ju, Yingbin Liang, Ness Shroff
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As an initial step towards addressing this challenge, this paper studies the generalization performance of overfitted meta-learning under a linear regression model with Gaussian features. ... With this upper bound and simulation results, we confirm the benign overfitting in meta-learning by comparing the model error of the overfitted solution with the underfitted solution. We further characterize some interesting properties of the descent curve. ... In Appendix E.2, we provide a further experiment where we train a two-layer fully connected neural network over the MNIST data set. (A simplified simulation sketch of the overfitted-vs-underfitted linear-model comparison follows the table.) |
| Researcher Affiliation | Academia | Peizhong Ju Department of ECE The Ohio State University Columbus, OH 43210, USA ju.171@osu.edu Yingbin Liang Department of ECE The Ohio State University Columbus, OH 43210, USA liang.889@osu.edu Ness B. Shroff Department of ECE & CSE The Ohio State University Columbus, OH 43210, USA shroff.11@osu.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not contain any statement about open-sourcing code or provide a link to a code repository. |
| Open Datasets | Yes | In this section, we further verify our theoretical findings by an experiment over a two-layer fully-connected neural network on the MNIST data set. |
| Dataset Splits | Yes | For each training task, there are 1000 training samples and 100 validation samples. ... The number of validation samples is nv = 100 for each of these 4 training tasks. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for running experiments. |
| Software Dependencies | No | The paper mentions software components like "deep neural networks" but does not list specific software dependencies (e.g., libraries, frameworks) with version numbers required for replication. |
| Experiment Setup | Yes | The step size in the outer-loop training is 0.3 and the step size of the one-step gradient adaptation is α_t = α_r = 0.05. After training 500 epochs, the meta-training error for each simulation is lower than 0.025 (the range of the meta-training error is [0, 1]). (A hedged training-loop sketch using these step sizes follows the table.) |
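
As a rough illustration of the linear-model comparison quoted in the Research Type row, the sketch below contrasts a minimum-ℓ2-norm (overfitted) least-squares solution with an underfitted one on pooled Gaussian-feature data from several tasks. The dimensions, noise levels, and the pooling of tasks are illustrative assumptions; this is not the paper's exact meta-learning objective, which uses one-step gradient adaptation.

```python
# Minimal, illustrative sketch (NOT the paper's exact construction): overfitted vs.
# underfitted least-squares estimation with Gaussian features. All dimensions and
# noise levels below are hypothetical choices made for illustration only.
import numpy as np

rng = np.random.default_rng(0)

p = 400            # number of features (overparameterized regime: p > total samples)
T, n = 4, 50       # number of training tasks and samples per task
sigma_task = 0.3   # spread of task-specific ground truths around the meta-parameter
sigma_noise = 0.1  # label noise level

w_meta = rng.normal(size=p) / np.sqrt(p)   # common "meta" ground truth

X_all, y_all = [], []
for _ in range(T):
    w_task = w_meta + sigma_task * rng.normal(size=p) / np.sqrt(p)
    X = rng.normal(size=(n, p))            # i.i.d. Gaussian features
    y = X @ w_task + sigma_noise * rng.normal(size=n)
    X_all.append(X)
    y_all.append(y)
X_all = np.vstack(X_all)                   # shape (T*n, p); here 200 x 400, so p > T*n
y_all = np.concatenate(y_all)

# Overfitted solution: minimum-l2-norm interpolator of all pooled training data.
w_overfit = np.linalg.pinv(X_all) @ y_all

# Underfitted solution: least squares restricted to the first q features (q < T*n).
q = 50
coef, *_ = np.linalg.lstsq(X_all[:, :q], y_all, rcond=None)
w_underfit = np.zeros(p)
w_underfit[:q] = coef

# "Model error" here is taken as the distance to the meta ground truth
# (a simplification of the paper's model-error metric).
print("overfitted  model error:", np.linalg.norm(w_overfit - w_meta))
print("underfitted model error:", np.linalg.norm(w_underfit - w_meta))
```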
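The Experiment Setup and Dataset Splits rows quote a MAML-style one-step adaptation procedure (outer step size 0.3, inner step size 0.05, 1000 training and 100 validation samples per task, a two-layer fully connected network on MNIST, 500 epochs). The sketch below shows one plausible PyTorch realization of such a loop; the hidden width, task sampler, cross-entropy loss, and the short epoch count are assumptions for illustration, not the paper's exact configuration.

```python
# Hedged sketch of a MAML-style one-step-adaptation loop using the quoted
# hyperparameters (outer step 0.3, inner step 0.05) on MNIST with a two-layer
# fully connected network. Task construction, hidden width, batch sizes, and the
# choice of loss are illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
mnist = datasets.MNIST("./data", train=True, download=True,
                       transform=transforms.ToTensor())

hidden = 256
# Two-layer fully connected network kept as explicit tensors so that the
# one-step adapted parameters can stay inside the autograd graph.
params = [
    torch.randn(hidden, 28 * 28, device=device) * 0.01,
    torch.zeros(hidden, device=device),
    torch.randn(10, hidden, device=device) * 0.01,
    torch.zeros(10, device=device),
]
for p in params:
    p.requires_grad_(True)

def forward(x, w):
    h = F.relu(F.linear(x.view(x.size(0), -1), w[0], w[1]))
    return F.linear(h, w[2], w[3])

def sample_task(n_train=1000, n_val=100):
    # Illustrative task sampler: a random MNIST subset split into adaptation
    # (training) and validation parts, mirroring the quoted 1000/100 split.
    idx = torch.randperm(len(mnist))[: n_train + n_val].tolist()
    xs = torch.stack([mnist[i][0] for i in idx]).to(device)
    ys = torch.tensor([mnist[i][1] for i in idx], device=device)
    return (xs[:n_train], ys[:n_train]), (xs[n_train:], ys[n_train:])

outer_lr, inner_lr = 0.3, 0.05        # step sizes quoted from the paper
outer_opt = torch.optim.SGD(params, lr=outer_lr)

for epoch in range(5):                # the paper reports 500 epochs; 5 here for a quick demo
    outer_opt.zero_grad()
    (x_tr, y_tr), (x_va, y_va) = sample_task()
    # Inner loop: one gradient step on the task's training samples.
    inner_loss = F.cross_entropy(forward(x_tr, params), y_tr)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]
    # Outer loop: meta-loss of the adapted parameters on the validation samples.
    meta_loss = F.cross_entropy(forward(x_va, adapted), y_va)
    meta_loss.backward()
    outer_opt.step()
    print(f"epoch {epoch}: meta-loss {meta_loss.item():.4f}")
```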