A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
Authors: Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit a kernel-like behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay. |
| Researcher Affiliation | Academia | Zixiang Chen, Yuan Cao, and Quanquan Gu: Department of Computer Science, University of California, Los Angeles, CA 90095, USA (chenzx19@cs.ucla.edu, yuancao@cs.ucla.edu, qgu@cs.ucla.edu). Tong Zhang: Department of Computer Science & Mathematics, Hong Kong University of Science & Technology, Hong Kong, China (tongzhang@tongzhang-ml.org). |
| Pseudocode | Yes | Algorithm 1: Noisy Gradient Descent for Training Two-layer Networks (a hedged sketch of such an update appears after this table). |
| Open Source Code | No | The paper does not provide an explicit statement or link regarding the release of source code for the described methodology. |
| Open Datasets | No | The paper mentions a "training data set S = {(x1, y1), ..., (xn, yn)}" as part of its problem setting and theoretical analysis, but it does not specify any particular publicly available dataset with concrete access information (link, DOI, or citation). |
| Dataset Splits | No | The paper is theoretical and does not report on empirical experiments that would require specific training/validation/test dataset splits for reproduction. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers used for running experiments. |
| Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details such as hyperparameter values or training configurations for empirical reproduction. |
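The paper presents its training procedure only as pseudocode (Algorithm 1, noisy gradient descent with weight decay). Below is a minimal NumPy sketch of how such an update could look for a two-layer ReLU network with the second layer fixed. The width `m`, step size `eta`, weight-decay coefficient `lam`, noise scale `sigma`, and the squared-loss objective are illustrative assumptions, not values taken from the paper; the exact noise distribution and scaling in Algorithm 1 may differ.

```python
import numpy as np

# Minimal sketch of noisy gradient descent with weight decay for a
# two-layer ReLU network f(x) = (1/sqrt(m)) * a^T relu(W x).
# All hyperparameters below are illustrative, not from the paper.

rng = np.random.default_rng(0)

n, d, m = 32, 10, 256               # samples, input dim, hidden width (assumed)
eta, lam, sigma = 0.1, 1e-3, 1e-3   # step size, weight decay, noise scale (assumed)

X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm toy inputs
y = np.sign(X[:, 0])                            # toy labels

W = rng.standard_normal((m, d))                 # first-layer weights (trained)
a = rng.choice([-1.0, 1.0], size=m)             # second layer fixed at +/-1

def forward(W, X):
    return (np.maximum(X @ W.T, 0.0) @ a) / np.sqrt(m)

for t in range(200):
    resid = forward(W, X) - y                   # squared-loss residual
    act = (X @ W.T > 0).astype(float)           # ReLU active-unit indicators, (n, m)
    # dL/dW for L = (1/2n) * sum_i (f(x_i) - y_i)^2
    grad = ((resid[:, None] * act) * a).T @ X / (n * np.sqrt(m))
    noise = sigma * rng.standard_normal(W.shape)
    # Noisy GD step with weight decay: W <- W - eta * (grad + lam * W) + noise
    W = W - eta * (grad + lam * W) + noise

print("final training loss:", 0.5 * np.mean((forward(W, X) - y) ** 2))
```

With `sigma = 0` and `lam = 0` this reduces to plain gradient descent; the paper's result, per the abstract quoted above, is that the noisy, weight-decayed variant still exhibits kernel-like (NTK) behavior, with the training loss converging linearly up to a certain accuracy.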