Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
Authors: Surbhi Goel, Adam Klivans
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we show that a natural distributional assumption corresponding to eigenvalue decay of the Gram matrix yields polynomial-time algorithms in the non-realizable setting for expressive classes of networks (e.g. feed-forward networks of ReLUs). We make no assumptions on the structure of the network or the labels. Given sufficiently strong eigenvalue decay, we obtain fully polynomial-time algorithms in all the relevant parameters with respect to square-loss. This is the first purely distributional assumption that leads to polynomial-time algorithms for networks of ReLUs. Further, unlike prior distributional assumptions (e.g., the marginal distribution is Gaussian), eigenvalue decay has been observed in practice on common data sets. (A sketch illustrating Gram-matrix eigenvalue decay follows the table.) |
| Researcher Affiliation | Academia | Surbhi Goel, Department of Computer Science, University of Texas at Austin (surbhi@cs.utexas.edu); Adam Klivans, Department of Computer Science, University of Texas at Austin (klivans@cs.utexas.edu) |
| Pseudocode | Yes | Algorithm 1 Compressed Kernel Regression (a hedged sketch of compressed kernel regression follows the table) |
| Open Source Code | No | The paper does not provide explicit statements or links for open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe training on a specific, publicly available dataset with access information. It mentions 'common data sets' only in a general sense and gives no details that would support reproducibility. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments, thus no validation dataset splits are mentioned. |
| Hardware Specification | No | The paper does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe empirical experiments; therefore, it does not provide details on experimental setup or hyperparameters. |
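
The eigenvalue-decay assumption quoted in the Research Type row can be illustrated numerically. The sketch below is not the paper's experiment: the RBF kernel, the synthetic Gaussian data, and all parameter values are assumptions chosen only to show how the spectrum of a smooth-kernel Gram matrix typically concentrates in its top eigenvalues.

```python
import numpy as np

def gram_matrix_eigenvalues(X, bandwidth=1.0):
    """Eigenvalues of the RBF Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    # Gram matrices are symmetric PSD, so eigvalsh applies; sort descending.
    return np.sort(np.linalg.eigvalsh(K))[::-1]

# Synthetic data (an assumption, not a data set from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
eigs = gram_matrix_eigenvalues(X)
print("top 5 eigenvalues:", eigs[:5])
print("fraction of spectrum in top 20:", eigs[:20].sum() / eigs.sum())
```

When the decay is rapid, most of the spectrum sits in a small number of directions, which is the regime the paper's polynomial-time guarantees target.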
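The Pseudocode row notes "Algorithm 1 Compressed Kernel Regression" but this page does not reproduce it. The following is a minimal sketch of a generic compressed (Nyström-style) kernel ridge regression, assuming random landmark selection, an RBF kernel, and a standard ridge system; every name and parameter here is an illustrative assumption, and the sketch should not be read as the paper's Algorithm 1.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def nystrom_kernel_regression(X, y, m=50, lam=1e-3, bandwidth=1.0, seed=0):
    """Fit ridge regression on a rank-m Nystrom compression of the Gram matrix.

    Returns a predictor f(X_new). If the Gram spectrum decays quickly, a
    small m captures most of the spectrum, so little accuracy is lost.
    """
    rng = np.random.default_rng(seed)
    landmarks = X[rng.choice(len(X), size=m, replace=False)]
    K_nm = rbf_kernel(X, landmarks, bandwidth)          # n x m cross-kernel
    K_mm = rbf_kernel(landmarks, landmarks, bandwidth)  # m x m landmark kernel
    # Compressed ridge system: (K_nm^T K_nm + lam * K_mm) alpha = K_nm^T y.
    alpha = np.linalg.solve(K_nm.T @ K_nm + lam * K_mm, K_nm.T @ y)
    return lambda X_new: rbf_kernel(X_new, landmarks, bandwidth) @ alpha

# Usage on synthetic data (again an assumption, not the paper's setup).
rng = np.random.default_rng(1)
X = rng.standard_normal((400, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)
predict = nystrom_kernel_regression(X, y, m=40)
print("train MSE:", np.mean((predict(X) - y) ** 2))
```

The design choice being illustrated is the same one the abstract appeals to: regression against a low-rank compression of the Gram matrix is cheap, and eigenvalue decay is what makes the compression faithful.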