Provable Generalization of Overparameterized Meta-learning Trained with SGD

Authors: Yu Huang, Yingbin Liang, Longbo Huang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings are further validated by experiments. Figures 1, 2, and 3 provide experimental results.
Researcher Affiliation | Academia | Yu Huang, IIIS, Tsinghua University (y-huang20@mails.tsinghua.edu.cn); Yingbin Liang, Department of ECE, The Ohio State University (liang.889@osu.edu); Longbo Huang, IIIS, Tsinghua University (longbohuang@tsinghua.edu.cn)
Pseudocode | Yes | Algorithm 1: MAML with SGD
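The paper's Algorithm 1 itself is not reproduced on this page. As a minimal illustration of one MAML-with-SGD step on a linear regression task, a sketch follows, assuming a one-gradient-step inner loop with inner step size alpha and outer step size beta (the function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def maml_sgd_step(w, X_in, y_in, X_out, y_out, alpha, beta):
    """One MAML-with-SGD step on a single linear regression task.

    Inner step: one gradient step on the task's training (support) set.
    Outer step: an SGD step on the validation (query) loss of the
    adapted parameters, with the chain rule through the inner update.
    """
    n1, n2 = len(y_in), len(y_out)
    # Inner adaptation: w' = w - alpha * grad of the in-task squared loss
    grad_in = X_in.T @ (X_in @ w - y_in) / n1
    w_adapted = w - alpha * grad_in
    # Jacobian of w' w.r.t. w for linear regression: I - alpha * X_in^T X_in / n1
    jac = np.eye(len(w)) - alpha * (X_in.T @ X_in) / n1
    # Outer gradient: chain rule applied to the validation loss at w'
    grad_out = jac @ (X_out.T @ (X_out @ w_adapted - y_out) / n2)
    return w - beta * grad_out
```

In practice the outer update would average `grad_out` over a minibatch of tasks; the sketch shows the single-task case for clarity.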
Open Source Code | No | The paper's self-assessment states that code is provided in the supplemental material, but the main text contains no specific statement or URL for open-source code.
Open Datasets | No | The paper uses a 'mixed linear regression model' and defines data distributions (e.g., 'x ∈ R^d is mean zero with covariance operator Σ = E[xx^⊤]'). While it describes the model, it does not refer to or provide access information for a named public dataset.
Dataset Splits | Yes | Suppose that D_t is randomly split into training and validation sets, denoted respectively as D_t^in = (X_t^in, y_t^in) and D_t^out = (X_t^out, y_t^out), correspondingly containing n_1 and n_2 samples (i.e., N = n_1 + n_2).
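The per-task split described above can be sketched as follows (a minimal illustration; the helper name and the use of NumPy are assumptions, not from the paper):

```python
import numpy as np

def split_task(X, y, n1, rng):
    """Randomly split one task's N samples into n1 training (support)
    samples and n2 = N - n1 validation (query) samples."""
    perm = rng.permutation(len(y))
    idx_in, idx_out = perm[:n1], perm[n1:]
    return (X[idx_in], y[idx_in]), (X[idx_out], y[idx_out])
```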
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | d = 500, T = 300, λ_i = 1/(i log²(i+1)), β_tr = 0.02, β_te = 0.2 (Figure 1 caption). d = 200, T = 100, Σ_θ = (0.8²/d) I, β_te = 0.2 (Figure 2 caption). Let s = T/log^p(T) and d = T log^q(T), where p, q > 0. Suppose P_x is Gaussian and the spectrum of Σ satisfies λ_k = 1/s for k ≤ s and λ_k = 1/(d − s) for s + 1 ≤ k ≤ d. Suppose the spectral parameter ν_i of Σ_θ is O(1), and let the step size α = 1/(2 c(β_tr, Σ) tr(Σ)) (Proposition 4).
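The covariance spectra quoted in the setup can be constructed directly. A sketch, assuming the polynomially decaying spectrum λ_i = 1/(i log²(i+1)) from the Figure 1 caption and the two-level spectrum from Proposition 4 (function names are illustrative):

```python
import numpy as np

def figure1_spectrum(d):
    """Polynomially decaying spectrum lambda_i = 1 / (i * log^2(i + 1)),
    for i = 1, ..., d (Figure 1 caption)."""
    i = np.arange(1, d + 1)
    return 1.0 / (i * np.log(i + 1) ** 2)

def proposition4_spectrum(d, s):
    """Two-level spectrum: lambda_k = 1/s for k <= s,
    and lambda_k = 1/(d - s) for s + 1 <= k <= d (Proposition 4)."""
    lam = np.full(d, 1.0 / (d - s))
    lam[:s] = 1.0 / s
    return lam
```

Either spectrum defines the diagonal of Σ up to a choice of eigenbasis; with P_x Gaussian, samples can then be drawn as x = Σ^{1/2} z with z standard normal.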