Overparameterization Improves Robustness to Covariate Shift in High Dimensions

Authors: Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Figure 1: The asymptotic predictions of Thm. 5.1 as a function of the overparameterization ratio ($\phi/\psi = n_1/m$) and the shift power ($\theta$) for the $(2, \theta)$-diatomic LJSD (Eq. (9)) with $\phi = n_0/m = 0.5$, $\sigma = \mathrm{ReLU}$, $\gamma = 0.001$, and $\sigma_\varepsilon^2 = 0.1$. ... Markers in (d,e,f) show simulations for $n_0 = 512$ and agree well with the asymptotic predictions.
Researcher Affiliation | Collaboration | Nilesh Tripuraneni (U.C. Berkeley, nilesh_tripuraneni@berkeley.edu); Ben Adlam (Brain Team, Google Research, adlam@google.com); Jeffrey Pennington (Brain Team, Google Research, jpennin@google.com)
Pseudocode | No | The paper presents mathematical formulas and theoretical derivations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of open-source code for the described methodology.
Open Datasets | Yes | We consider the task of learning an unknown function from $m$ i.i.d. samples $(x_i, y_i) \in \mathbb{R}^{n_0} \times \mathbb{R}$ for $i \in \{1, \ldots, m\}$, where the covariates are Gaussian, $x_i \sim \mathcal{N}(0, \Sigma)$ with positive definite covariance matrix $\Sigma$, and the labels are generated by a linear function parameterized by $\beta \in \mathbb{R}^{n_0}$, drawn from $\mathcal{N}(0, I_{n_0})$.
Dataset Splits | No | The paper describes the training and test distributions and their characteristics but does not explicitly mention or specify any validation dataset splits.
Hardware Specification | No | The paper mentions running simulations (e.g., "simulations for $n_0 = 512$" in Figure 1), but it does not provide any specific details about the hardware used, such as GPU/CPU models, memory, or cloud resources.
Software Dependencies | No | The paper does not provide specific software dependencies, such as programming languages or libraries with their version numbers, that would be needed to replicate the experiments.
Experiment Setup | Yes | Figure 1: The asymptotic predictions of Thm. 5.1 as a function of the overparameterization ratio ($\phi/\psi = n_1/m$) and the shift power ($\theta$) for the $(2, \theta)$-diatomic LJSD (Eq. (9)) with $\phi = n_0/m = 0.5$, $\sigma = \mathrm{ReLU}$, $\gamma = 0.001$, and $\sigma_\varepsilon^2 = 0.1$. ... Numerical predictions from Thm. 5.1 can be obtained by first solving the self-consistent equation for $x$ by fixed-point iteration, $x \mapsto \frac{1 - \gamma\tau}{\omega + I_{1,1}}$, and then plugging the result into the remaining terms.
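
The Experiment Setup entry says the self-consistent equation for x is solved by fixed-point iteration. Since the paper releases no code, the following is only a minimal sketch of that generic procedure, not the authors' implementation. The names fixed_point and toy_update are hypothetical, and toy_update is a stand-in map chosen because its fixed point is known in closed form. The actual update from Thm. 5.1 depends on quantities that this summary does not define, so it would have to be supplied from the paper.

```python
import math
from typing import Callable


def fixed_point(update: Callable[[float], float], x0: float,
                tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Iterate x <- update(x) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = update(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_update(x: float) -> float:
    # ASSUMPTION: hypothetical stand-in map, NOT the map from Thm. 5.1.
    # Its positive fixed point solves x = 1 / (1 + x), i.e. x = (sqrt(5) - 1) / 2.
    return 1.0 / (1.0 + x)


x_star = fixed_point(toy_update, x0=1.0)
print(x_star)
```

For the stand-in map the iteration converges to (sqrt(5) - 1) / 2, roughly 0.618. To apply the same loop to the theorem, toy_update would be replaced by a function that evaluates the right-hand side of the self-consistent equation at the current x. Convergence for that map is not guaranteed by this sketch and should be checked, for example by monitoring the residual |update(x) - x|.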