Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Authors: Ziang Chen, Rong Ge
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we study the mean-field flow for learning subspace-sparse polynomials using stochastic gradient descent and two-layer neural networks, where the input distribution is standard Gaussian and the output only depends on the projection of the input onto a low-dimensional subspace. We establish a necessary condition for SGD-learnability, involving both the characteristics of the target function and the expressiveness of the activation function. In addition, we prove that the condition is almost sufficient, in the sense that a condition slightly stronger than the necessary condition can guarantee the exponential decay of the loss functional to zero. *(An illustrative code sketch of this setup follows the table.)* |
| Researcher Affiliation | Academia | Ziang Chen, Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, ziang@mit.edu; Rong Ge, Department of Computer Science and Department of Mathematics, Duke University, Durham, NC 27708, rongge@cs.duke.edu |
| Pseudocode | Yes | Algorithm 1: Training strategy |
| Open Source Code | No | This paper is purely theoretical and does not provide open-source code for the described methodology. |
| Open Datasets | No | This paper is purely theoretical and does not include empirical training with a dataset. |
| Dataset Splits | No | This paper is purely theoretical and does not include empirical validation with dataset splits. |
| Hardware Specification | No | This paper is purely theoretical and does not include experiments, thus no hardware specifications are provided. |
| Software Dependencies | No | This paper is purely theoretical and does not include experiments, thus no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | This paper is purely theoretical and does not include empirical experiments with specific setup details like hyperparameters. |
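
Since the paper itself ships no code, the following is a minimal sketch of the setting described in the abstract above, not the authors' implementation: a mean-field two-layer network trained by online SGD to fit a target that depends on its standard-Gaussian input only through a low-dimensional projection. The dimensions, the ReLU activation, the degree-2 Hermite target, and all hyperparameters are illustrative assumptions.

```python
# Hedged sketch of the learning setup (subspace-sparse polynomial target,
# Gaussian input, mean-field two-layer network, online SGD). All concrete
# choices below are assumptions for illustration, not the paper's Algorithm 1.

import numpy as np

rng = np.random.default_rng(0)

d, k, m = 20, 1, 512                     # ambient dim, subspace dim, neurons
V = rng.standard_normal((k, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # unit row(s) spanning the subspace

def target(X):
    # Subspace-sparse polynomial: depends on the input only through V x.
    Z = X @ V.T
    return Z[..., 0] ** 2 - 1.0          # second Hermite polynomial He_2(v . x)

# Mean-field parameterization: f(x) = (1/m) * sum_i a_i * relu(w_i . x).
# ReLU has a nonzero degree-2 Hermite coefficient, so it is expressive enough
# for this target (cf. the paper's condition on the activation function).
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m)

lr, steps = 0.02, 50_000
for _ in range(steps):
    x = rng.standard_normal(d)           # fresh standard-Gaussian sample
    pre = W @ x
    h = np.maximum(pre, 0.0)
    err = (a @ h) / m - target(x)        # residual of the squared loss
    # Per-particle gradients carry a 1/m factor; scaling the step by m keeps
    # each neuron moving at O(1), the usual mean-field time scale.
    grad_a = err * h
    grad_W = err * np.outer(a * (pre > 0), x)
    a -= lr * grad_a
    W -= lr * grad_W

# Monte Carlo estimate of the population loss after training.
xs = rng.standard_normal((5000, d))
preds = (np.maximum(xs @ W.T, 0.0) @ a) / m
print(f"estimated population loss: {0.5 * np.mean((preds - target(xs)) ** 2):.4f}")
```

Under the paper's almost-sufficient condition one would expect the loss functional to decay exponentially along the mean-field flow; a finite-width, finite-step simulation like this one only approximates that dynamics, so the final loss should be small but need not vanish.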