Learnability of high-dimensional targets by two-parameter models and gradient flow
Authors: Dmitry Yarotsky
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We explore the theoretical possibility of learning d-dimensional targets with W-parameter models by gradient flow (GF) when W < d. Our main result shows that if the targets are described by a particular d-dimensional probability distribution, then there exist models with as few as two parameters that can learn the targets with arbitrarily high success probability. On the other hand, we show that for W < d there is necessarily a large subset of GF-non-learnable targets. In particular, the set of learnable targets is not dense in R^d, and any subset of R^d homeomorphic to the W-dimensional sphere contains non-learnable targets. Finally, we observe that the model in our main theorem on almost guaranteed two-parameter learning is constructed using a hierarchical procedure and as a result is not expressible by a single elementary function. We show that this limitation is essential in the sense that most models written in terms of elementary functions cannot achieve the learnability demonstrated in this theorem. (A minimal gradient-flow sketch illustrating this setup appears after the table.) |
| Researcher Affiliation | Academia | Dmitry Yarotsky, Skoltech, d.yarotsky@skoltech.ru |
| Pseudocode | No | The paper describes mathematical constructions and proofs in prose and equations, but it does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is purely theoretical and does not mention releasing any source code for its methodology. The NeurIPS checklist further confirms that the paper does not include experiments, implying no code release. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving specific training datasets. It refers to abstract 'target spaces' and 'probability distributions' (e.g., 'µ be any Borel probability measure on H'), but these are theoretical constructs, not concrete datasets used for empirical evaluation. |
| Dataset Splits | No | The paper is purely theoretical and does not involve empirical evaluation with dataset splits (training, validation, or testing). |
| Hardware Specification | No | The paper is purely theoretical and does not describe any empirical experiments; therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is purely theoretical and does not describe any empirical experiments; therefore, no software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper is purely theoretical and does not include any empirical experiments. As such, there are no details regarding experimental setup, hyperparameters, or system-level training settings. |
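
For orientation, here is a minimal sketch of the gradient-flow setup the abstract refers to: a model Φ maps W parameters into R^d, and the parameters evolve by dw/dt = −∇L(w) with L(w) = ½‖Φ(w) − target‖². The sketch uses an arbitrary linear toy model and an Euler discretization; it is **not** the paper's construction (which is hierarchical and not expressible by a single elementary function), and the matrix `A`, the step size `dt`, and the chosen target are illustrative assumptions only.

```python
import numpy as np

d = 5                                # target dimension
rng = np.random.default_rng(0)
A = rng.standard_normal((d, 2))      # arbitrary linear toy model, NOT the paper's construction

def phi(w):
    """Toy two-parameter model Phi: R^2 -> R^d (illustrative stand-in)."""
    return A @ w

def loss_grad(w, target):
    """Gradient of L(w) = 0.5 * ||Phi(w) - target||^2 for the linear toy model."""
    return A.T @ (phi(w) - target)

target = A @ rng.standard_normal(2)  # a target this toy model can actually reach
w = np.zeros(2)
dt = 1e-2                            # Euler step size (assumed small enough for stability)
for _ in range(10_000):              # explicit Euler discretization of gradient flow
    w -= dt * loss_grad(w, target)

print(np.linalg.norm(phi(w) - target))  # near 0 for this reachable target
```

Note that for this linear toy model the reachable targets form only a 2-dimensional subspace of R^d, which is consistent with the paper's negative results for W < d; the point of the paper's main theorem is that a suitably constructed (non-elementary, hierarchical) two-parameter model can nonetheless learn targets drawn from a d-dimensional distribution with arbitrarily high probability.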