Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks

Authors: Benjamin Bowman, Guido Montufar

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the dynamics of a neural network in function space when optimizing the mean squared error via gradient flow. We show that in the underparameterized regime the network learns eigenfunctions of an integral operator T_K determined by the Neural Tangent Kernel (NTK) at rates corresponding to their eigenvalues. Our results can be understood as describing a spectral bias in the underparameterized regime. The proofs use the concept of Damped Deviations. (An idealized form of these dynamics is sketched after this table.)
Researcher Affiliation | Academia | Benjamin Bowman, UCLA Department of Mathematics, benbowman314@math.ucla.edu; Guido Montufar, UCLA Departments of Mathematics and Statistics and MPI MIS, montufar@math.ucla.edu
Pseudocode | No | The paper is theoretical and focuses on mathematical proofs and analysis; it does not contain pseudocode or algorithm blocks.
Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository.
Open Datasets | No | The paper is theoretical and does not perform empirical experiments on specific datasets. It discusses 'training data' as a conceptual element of its theoretical framework ('Our training data consists of n input-label pairs'), not as a concrete, publicly available dataset used for evaluation.
Dataset Splits | No | The paper is theoretical and does not conduct experiments with training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings.
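
The spectral-bias claim quoted in the Research Type row can be made concrete with the idealized, fixed-kernel dynamics that the paper's Damped Deviations technique compares the network against. The display below is a minimal LaTeX sketch under the simplifying assumption that the kernel is held fixed: the operator T_K, its eigenpairs (lambda_i, phi_i), and the residual r_t = f_t - f^* are the paper's objects, while the closed-form solution is the standard consequence for linear dynamics rather than the paper's exact theorem statement.

    % Residual r_t = f_t - f^* under gradient flow with a fixed kernel K.
    \[
      \partial_t r_t = -T_K\, r_t
      \quad\Longrightarrow\quad
      \langle r_t, \phi_i \rangle = e^{-\lambda_i t}\,\langle r_0, \phi_i \rangle ,
    \]
    % Components along eigenfunctions with large eigenvalues decay fastest,
    % which is the spectral bias summarized in the row above; the paper bounds
    % how far the true network dynamics deviate from this idealized solution.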