Deep Networks Provably Classify Data on Curves

Authors: Tingran Wang, Sam Buchanan, Dar Gilboa, John Wright

NeurIPS 2021

Reproducibility assessment (variable, result, and LLM response):

- Research Type: Experimental. The paper provides a preliminary numerical experiment in Appendix A.3: when training fully-connected networks with gradient descent on a simple manifold classification task, low training error appears to be easily achievable only when the decay scale of the kernel is small relative to the inter-manifold distance, even at moderate depth and width, and this decay scale is controlled by the depth of the network.
- Researcher Affiliation: Academia. Tingran Wang, Columbia University (tw2579@columbia.edu); Sam Buchanan, Columbia University (s.buchanan@columbia.edu); Dar Gilboa, Harvard University (dar_gilboa@fas.harvard.edu); John Wright, Columbia University (jw2966@columbia.edu).
- Pseudocode: No. The paper contains no clearly labeled pseudocode or algorithm blocks.
- Open Source Code: No. The paper makes no statement about releasing source code and provides no link to a code repository for the described methodology.
- Open Datasets: No. Although Figure 2 shows a t-SNE projection of MNIST images, this is purely illustrative; the main numerical experiments use a synthetic "two curve problem" geometry (Figure 1), and no concrete access information (link, citation, or repository) is provided for the generated data.
- Dataset Splits: No. The paper describes its numerical experiments on a synthetic setup but does not specify any training, validation, or test splits.
- Hardware Specification: No. The paper gives no details about the hardware (e.g., GPU/CPU models or cloud resources) used to run the experiments.
- Software Dependencies: No. The paper does not list software dependencies with version numbers.
- Experiment Setup: No. The paper mentions "randomly-initialized gradient descent" and a step size τ, but provides no specific hyperparameter values or detailed system-level training configurations for its numerical experiments.
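The setup the responses describe, randomly initialized gradient descent with a fixed step size τ on a synthetic two-curve classification task, can be sketched as follows. This is a minimal hypothetical illustration, not the paper's code: the concentric-circle geometry, network width, step size, and iteration count are all assumptions, since the paper reports none of these values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the paper's synthetic "two curve problem":
# here the two classes are points on concentric circles in the plane.
# Every numeric choice below is an arbitrary illustration.
def sample_circle(radius, n):
    t = rng.uniform(0.0, 2 * np.pi, n)
    return np.stack([radius * np.cos(t), radius * np.sin(t)], axis=1)

n = 200
X = np.vstack([sample_circle(1.0, n), sample_circle(2.0, n)])
y = np.concatenate([-np.ones(n), np.ones(n)])

# One-hidden-layer ReLU network, trained by full-batch gradient descent
# on the mean squared error with a fixed step size tau.
d, width, tau, steps = 2, 64, 0.005, 4000
W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, width))
b1 = np.zeros(width)
w2 = rng.normal(0.0, 1.0 / np.sqrt(width), width)

def forward():
    return np.maximum(X @ W1 + b1, 0.0) @ w2

init_loss = np.mean((forward() - y) ** 2)

for _ in range(steps):
    h = np.maximum(X @ W1 + b1, 0.0)      # hidden activations
    f = h @ w2                            # network output
    g = 2.0 * (f - y) / len(y)            # d(loss)/d(output)
    gh = np.outer(g, w2) * (h > 0)        # backprop through the ReLU
    W1 -= tau * (X.T @ gh)
    b1 -= tau * gh.sum(axis=0)
    w2 -= tau * (h.T @ g)

final_loss = np.mean((forward() - y) ** 2)
train_err = np.mean(np.sign(forward()) != y)
print(f"loss {init_loss:.3f} -> {final_loss:.3f}, training error {train_err:.3f}")
```

With the two curves well separated, as here, a small fully-connected network fits the training labels readily; the paper's point is that this becomes delicate when the inter-manifold distance shrinks relative to the kernel's decay scale.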