Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Authors: Benjamin Bowman, Guido Montúfar
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the dynamics of a neural network in function space when optimizing the mean squared error via gradient flow. We show that in the underparameterized regime the network learns eigenfunctions of an integral operator T_K determined by the Neural Tangent Kernel (NTK) at rates corresponding to their eigenvalues. Our results can be understood as describing a spectral bias in the underparameterized regime. The proofs use the concept of Damped Deviations. (A numerical sketch of this spectral bias appears after the table.) |
| Researcher Affiliation | Academia | Benjamin Bowman, UCLA Department of Mathematics, benbowman314@math.ucla.edu; Guido Montúfar, UCLA Departments of Mathematics and Statistics and MPI MiS, montufar@math.ucla.edu |
| Pseudocode | No | The paper is theoretical and focuses on mathematical proofs and analysis; it does not contain pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not perform empirical experiments with specific datasets. It discusses 'training data' as a conceptual element in its theoretical framework ('Our training data consists of n input-label pairs'), but not as a concrete, publicly available dataset used for evaluation. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. |
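Since the paper is purely theoretical and releases no code, the following is a minimal numerical sketch (not from the authors) of the spectral bias described in the Research Type row: under gradient flow on the MSE with a fixed kernel, the residual's components along the kernel's eigenvectors decay at rates proportional to the corresponding eigenvalues. The RBF kernel here is an illustrative stand-in for the NTK Gram matrix; all names and parameter choices are assumptions for demonstration.

```python
import numpy as np

# Sketch only (the paper provides no code). Under gradient flow on the MSE with a
# fixed kernel K, the training residual r(t) = u(t) - y solves r'(t) = -(K / n) r(t),
# so its component along the i-th eigenvector of K decays as exp(-lambda_i * t / n).

rng = np.random.default_rng(0)

n = 50                                        # number of training points
X = np.sort(rng.uniform(-1.0, 1.0, n))        # 1-D inputs
y = np.sin(3.0 * X) + 0.5 * np.sin(9.0 * X)   # low- plus high-frequency target

# Stand-in Gram matrix (RBF kernel) playing the role of the NTK. (Assumption:
# the bandwidth 0.1 is arbitrary; the paper's analysis uses the actual NTK.)
K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.1 ** 2)

lam, V = np.linalg.eigh(K)                    # eigenvalues ascending, eigenvectors in columns
c0 = V.T @ (-y)                               # residual coefficients at t = 0, taking u(0) = 0

for t in [0.0, 10.0, 100.0, 1000.0]:
    c = np.exp(-lam * t / n) * c0             # residual coefficients at time t
    top = np.abs(c[-3:])[::-1]                # largest-eigenvalue directions (fit first)
    bottom = np.abs(c[:3])                    # smallest-eigenvalue directions (fit last)
    print(f"t={t:7.1f}  |top-3|={top.round(4)}  |bottom-3|={bottom.round(4)}")
```

Running the sketch shows the large-eigenvalue (smooth) components of the target being fit orders of magnitude faster than the small-eigenvalue ones, which is the paper's spectral-bias statement in finite-dimensional form.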