Does Preprocessing Help Training Over-parameterized Neural Networks?
Authors: Zhao Song, Shuo Yang, Ruizhe Zhang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | From the technical perspective, our result is a sophisticated combination of tools from different fields: greedy-type convergence analysis from optimization, the sparsity observation from practical work, high-dimensional geometric search from data structures, and concentration and anti-concentration bounds from probability. Our results also provide theoretical insights into a large number of previously established fast training methods. In addition, our classical algorithm can be generalized to the quantum computation model. Interestingly, we obtain a similar sublinear cost per iteration while avoiding preprocessing of the initial weights or input data points. |
| Researcher Affiliation | Collaboration | Zhao Song, Adobe Research (zsong@adobe.com); Shuo Yang, The University of Texas at Austin (yangshuo_ut@utexas.edu); Ruizhe Zhang, The University of Texas at Austin (ruizhe@utexas.edu) |
| Pseudocode | Yes | Algorithm 1: Half-Space Report Data Structure; Algorithm 2: Training Neural Network via building a data structure over the weights of the neural network; Algorithm 3: Training Neural Network via building a data structure over the input data points (an illustrative interface sketch follows the table). |
| Open Source Code | No | The NeurIPS checklist item 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?' is answered [N/A]; no code is released. |
| Open Datasets | No | The paper is theoretical and does not involve experiments with specific datasets. It refers to 'n data points in d-dimensional space' but does not specify a publicly available dataset or provide access information for any data used. |
| Dataset Splits | No | The paper is theoretical and does not include an experimental section with data splits (training, validation, test). |
| Hardware Specification | No | The paper is theoretical and does not conduct experiments; therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not conduct experiments; therefore, no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper is theoretical and does not contain an experimental section; therefore, no specific experimental setup details or hyperparameters are provided. |
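
Algorithm 1 in the paper is a half-space report data structure, which Algorithms 2 and 3 build over the network's weights and over the input data points, respectively. The snippet below is only a minimal sketch of that interface, not the paper's sublinear-time construction; the class name `HalfSpaceReport`, the threshold parameter `b`, and the dimensions in the usage example are illustrative choices made here, not taken from the paper.

```python
import numpy as np


class HalfSpaceReport:
    """Illustrative interface for a half-space report structure.

    init stores a set of vectors; query(x) reports every stored vector
    w_r whose inner product with x exceeds a threshold b. This naive
    version scans all m vectors in O(m * d) time per query; the paper's
    data structure answers the same kind of query in sublinear time.
    """

    def __init__(self, W, b=0.0):
        # W: (m, d) array of stored vectors (weights for Algorithm 2,
        # data points for Algorithm 3). b: threshold; b = 0 corresponds
        # to the standard ReLU activation test <w_r, x> > 0.
        self.W = np.asarray(W, dtype=float)
        self.b = float(b)

    def query(self, x):
        # Report the indices r with <w_r, x> > b, i.e. the neurons that
        # fire on input x.
        return np.flatnonzero(self.W @ np.asarray(x, dtype=float) > self.b)


# Usage: build the structure over randomly initialized weights and ask
# which neurons a given input activates, as a training loop would do at
# every iteration.
rng = np.random.default_rng(0)
m, d = 1024, 32
hsr = HalfSpaceReport(rng.standard_normal((m, d)) / np.sqrt(d))
x = rng.standard_normal(d)
print("active neurons:", hsr.query(x).size, "of", m)
```

The scan above is linear in the number of neurons m; the paper's contribution is to make this report step sublinear per iteration via high-dimensional geometric search, which is where the preprocessing of the weights (Algorithm 2) or of the data points (Algorithm 3) pays off.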