Deep Networks Provably Classify Data on Curves
Authors: Tingran Wang, Sam Buchanan, Dar Gilboa, John Wright
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a preliminary numerical experiment to this end in Appendix A.3. When training fully-connected networks with gradient descent on a simple manifold classification task, low training error appears to be easily achievable, even at moderate depth and width, only when the decay scale of the kernel is small relative to the inter-manifold distance; this decay scale is controlled by the depth of the network. (An illustrative kernel-decay probe appears after this table.) |
| Researcher Affiliation | Academia | Tingran Wang (Columbia University, tw2579@columbia.edu); Sam Buchanan (Columbia University, s.buchanan@columbia.edu); Dar Gilboa (Harvard University, dar_gilboa@fas.harvard.edu); John Wright (Columbia University, jw2966@columbia.edu) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | No | While Figure 2 shows a 't-SNE projection of MNIST images', it is purely illustrative. The main numerical experiments are conducted on a synthetic 'two curve problem' geometry (Figure 1) for which no concrete access information (link, citation, or repository) is provided. (An illustrative generation sketch follows this table.) |
| Dataset Splits | No | The paper describes its numerical experiments on a synthetic setup but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, cloud resources) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper mentions 'randomly-initialized gradient descent' and a 'step size τ' but does not provide specific hyperparameter values or detailed system-level training configurations for its numerical experiments. (A minimal training-loop sketch follows this table.) |
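
Since no access information for the synthetic data is given, the sketch below shows one plausible way to generate a two-curve geometry in the spirit of the paper's Figure 1: two smooth closed curves on the unit sphere S^{n0-1}, sampled densely and labeled ±1. The Fourier-mode construction, the `sep` offset, and every default value here are assumptions for illustration, not the authors' construction.

```python
import numpy as np

def two_curves(n0=100, N=512, sep=0.4, seed=0):
    """Hypothetical stand-in for the paper's 'two curve problem' data:
    two smooth closed curves on the unit sphere S^{n0-1}, labeled +/-1.
    The construction and all defaults are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)

    def curve(shift):
        # Smooth closed curve: a few random low-frequency Fourier modes in R^{n0}.
        a = rng.standard_normal((3, n0))
        b = rng.standard_normal((3, n0))
        x = sum(a[k] * np.cos((k + 1) * t)[:, None]
                + b[k] * np.sin((k + 1) * t)[:, None] for k in range(3))
        x = x + shift  # random offset so the two curves (typically) stay disjoint
        return x / np.linalg.norm(x, axis=1, keepdims=True)  # project onto the sphere

    X = np.concatenate([curve(+sep * rng.standard_normal(n0)),
                        curve(-sep * rng.standard_normal(n0))])  # (2N, n0) inputs
    y = np.concatenate([np.ones(N), -np.ones(N)])                # +/-1 labels
    return X, y
```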
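Likewise, because the training configuration is unreported, the following minimal sketch captures only the shape of the described setup: a randomly initialized fully-connected ReLU network trained by full-batch gradient descent with step size τ. The width, depth, step size, iteration count, and squared-loss objective are placeholders, not values taken from the paper.

```python
import torch
from torch import nn

def train_fc(X, y, width=512, depth=8, tau=0.01, steps=2000):
    """Minimal sketch of the described setup: a randomly initialized
    fully-connected ReLU network trained by full-batch gradient descent
    with step size tau. All hyperparameter values are placeholders."""
    X = torch.as_tensor(X, dtype=torch.float32)
    y = torch.as_tensor(y, dtype=torch.float32)
    layers, d = [], X.shape[1]
    for _ in range(depth - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    net = nn.Sequential(*layers, nn.Linear(d, 1))

    opt = torch.optim.SGD(net.parameters(), lr=tau)  # full batch + no momentum = plain GD
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(X).squeeze(-1), y)  # squared loss as a placeholder objective
        loss.backward()
        opt.step()
    return net
```

Chaining the two sketches as `train_fc(*two_curves())` reproduces the overall pipeline shape; `torch.optim.SGD` with a full-batch loss and no momentum is plain gradient descent, so the only randomness is the initialization.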
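Finally, the kernel-decay claim quoted in the Research Type row can be probed directly: at random initialization, an entry of the empirical neural tangent kernel Θ(x₁, x₂) = ⟨∇_θ f(x₁), ∇_θ f(x₂)⟩ is computable by autograd, and its decay with the angle between unit-norm inputs can be compared across depths. The sizes and angles below are arbitrary; this illustrates the qualitative depth-controlled decay the paper analyzes, not its exact experiment.

```python
import math
import torch
from torch import nn

def make_net(n0, width=256, depth=4):
    """Random fully-connected ReLU network (placeholder sizes)."""
    layers, d = [], n0
    for _ in range(depth - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    return nn.Sequential(*layers, nn.Linear(d, 1))

def empirical_ntk(net, x1, x2):
    """One empirical NTK entry: <grad_theta f(x1), grad_theta f(x2)>."""
    def flat_grad(x):
        out = net(x.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(out, list(net.parameters()))
        return torch.cat([g.reshape(-1) for g in grads])
    return (flat_grad(x1) @ flat_grad(x2)).item()

n0 = 64
base = torch.randn(n0); base /= base.norm()
other = torch.randn(n0)
other -= (other @ base) * base  # Gram-Schmidt: direction orthogonal to base
other /= other.norm()
for depth in (2, 8):
    net = make_net(n0, depth=depth)
    k0 = empirical_ntk(net, base, base)  # normalize by Theta(x, x)
    decays = [empirical_ntk(net, base, math.cos(a) * base + math.sin(a) * other) / k0
              for a in (0.0, 0.5, 1.0, 2.0)]  # angles in radians
    print(f"depth={depth}:", [f"{v:.3f}" for v in decays])
```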