On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective
Authors: Lili Su, Pengkun Yang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A numerical illustration of the decay of λmin(H) as n grows is presented in Fig. 1a, and a numerical illustration of the spectrum concentration of K is given in Fig. 1b; training is performed with f being randomly generated linear or quadratic functions, with n = 1000 and m = 2000 (a code sketch of this illustration appears after the table). |
| Researcher Affiliation | Academia | Lili Su, CSAIL, MIT (lilisu@mit.edu); Pengkun Yang, Department of Electrical Engineering, Princeton University (pengkuny@princeton.edu) |
| Pseudocode | No | The paper describes the gradient descent update rules and initialization steps in paragraph text and mathematical equations (e.g., (3), (5)) but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology. |
| Open Datasets | No | The paper describes data generation from a distribution (e.g., 'uniform distribution on the spheres') and mentions 'training with f being randomly generated linear or quadratic functions', but it does not specify or provide access information (link, citation to a public dataset) for any publicly available dataset used for its numerical illustrations. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It mentions 'training dataset' but no specific percentages or counts for different splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for its numerical illustrations or computations. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For each k = 1, ..., m/2: initialize w_{2k-1} ~ N(0, I), set a_{2k-1} = 1 with probability 1/2 and a_{2k-1} = -1 with probability 1/2, then initialize w_{2k} = w_{2k-1} and a_{2k} = -a_{2k-1}. All randomness in this initialization is independent, and is independent of the dataset. Gradient descent is run with stepsize/learning rate η > 0. Training is performed with f being randomly generated linear or quadratic functions, with n = 1000 and m = 2000 (a code sketch of this setup appears after the table). |
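
Below is a minimal sketch of the experiment setup quoted in the last row, assuming a two-layer ReLU network f(x) = (1/√m) Σ_r a_r ReLU(⟨w_r, x⟩) trained by full-batch gradient descent on the squared loss with only the first-layer weights updated. The network scaling, the loss, the fixed outer weights, the specific target function, and the stepsize value are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np

def init_params(m, d, rng):
    """Paired (symmetric) initialization from the Experiment Setup row:
    w_{2k-1} ~ N(0, I), a_{2k-1} = +/-1 with probability 1/2 each,
    then w_{2k} = w_{2k-1} and a_{2k} = -a_{2k-1}, so the network outputs 0 at init."""
    assert m % 2 == 0
    W = np.empty((m, d))
    a = np.empty(m)
    for k in range(m // 2):
        w = rng.standard_normal(d)
        s = rng.choice([-1.0, 1.0])
        W[2 * k], W[2 * k + 1] = w, w
        a[2 * k], a[2 * k + 1] = s, -s
    return W, a

def predict(X, W, a):
    # Assumed network: f(x) = (1/sqrt(m)) * sum_r a_r * ReLU(<w_r, x>).
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(W.shape[0])

def train_gd(X, y, m=2000, eta=0.1, steps=200, seed=0):
    """Full-batch gradient descent on the squared loss, updating W only
    (keeping the outer weights a fixed is an assumed simplification)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W, a = init_params(m, d, rng)
    for _ in range(steps):
        pre = X @ W.T                                      # (n, m) pre-activations
        resid = np.maximum(pre, 0.0) @ a / np.sqrt(m) - y  # residuals f(x_i) - y_i
        # d/dW_r of (1/2) sum_i resid_i^2:
        #   (a_r / sqrt(m)) * sum_i resid_i * 1{<w_r, x_i> > 0} * x_i
        grad = (a[None, :] * (pre > 0) * resid[:, None]).T @ X / np.sqrt(m)
        W -= eta * grad
    return W, a

if __name__ == "__main__":
    # Hypothetical data: n = 1000 points on the unit sphere in R^10 and a
    # randomly generated quadratic target f(x) = <v, x>^2 (illustrative choice).
    rng = np.random.default_rng(1)
    n, d = 1000, 10
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    v = rng.standard_normal(d)
    y = (X @ v) ** 2
    W, a = train_gd(X, y, m=2000)
    print("training MSE:", np.mean((predict(X, W, a) - y) ** 2))
```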
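
A second sketch corresponds to the numerical illustration referenced in the Research Type row (decay of λmin(H) with n). It assumes H denotes the ReLU Gram matrix with the standard closed form H_ij = ⟨x_i, x_j⟩(π − arccos⟨x_i, x_j⟩)/(2π) for unit-norm inputs drawn uniformly from the sphere; this closed form, the input dimension, and the sample sizes are assumptions and may not match the paper's exact definitions of H and K.

```python
import numpy as np

def relu_gram(X):
    """Assumed Gram matrix: H_ij = E_{w~N(0,I)}[<x_i,x_j> 1{<w,x_i> >= 0, <w,x_j> >= 0}]
    = <x_i,x_j> * (pi - arccos(<x_i,x_j>)) / (2*pi) for unit-norm rows of X."""
    G = np.clip(X @ X.T, -1.0, 1.0)          # cosine similarities (rows are unit norm)
    return G * (np.pi - np.arccos(G)) / (2 * np.pi)

def lambda_min_vs_n(d=10, sizes=(100, 200, 400, 800, 1000), seed=0):
    """Smallest eigenvalue of H for increasing n, with x_i uniform on the unit sphere;
    illustrates how lambda_min(H) decays as n grows (cf. Fig. 1a)."""
    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        X = rng.standard_normal((n, d))
        X /= np.linalg.norm(X, axis=1, keepdims=True)
        results[n] = np.linalg.eigvalsh(relu_gram(X))[0]   # ascending order: [0] is the minimum
    return results

if __name__ == "__main__":
    for n, lam in lambda_min_vs_n().items():
        print(f"n = {n:4d}   lambda_min(H) = {lam:.3e}")
```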