A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks

Authors: Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
-------------------------------------------------
Research Type | Theoretical | In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit a kernel-like behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
Researcher Affiliation | Academia | Zixiang Chen (Department of Computer Science, University of California, Los Angeles, CA 90095, USA; chenzx19@cs.ucla.edu); Yuan Cao (Department of Computer Science, University of California, Los Angeles, CA 90095, USA; yuancao@cs.ucla.edu); Quanquan Gu (Department of Computer Science, University of California, Los Angeles, CA 90095, USA; qgu@cs.ucla.edu); Tong Zhang (Department of Computer Science & Mathematics, Hong Kong University of Science & Technology, Hong Kong, China; tongzhang@tongzhang-ml.org)
Pseudocode | Yes | The paper provides Algorithm 1, "Noisy Gradient Descent for Training Two-layer Networks"; a hedged sketch of this algorithm appears after the table.
Open Source Code | No | The paper does not provide an explicit statement or link regarding the release of source code for the described methodology.
Open Datasets | No | The paper mentions a training data set S = {(x1, y1), . . . , (xn, yn)} as part of its problem setting and theoretical analysis, but it does not identify any publicly available dataset with concrete access information (link, DOI, or citation).
Dataset Splits | No | The paper is theoretical and does not report empirical experiments that would require specific training/validation/test splits for reproduction.
Hardware Specification | No | The paper is theoretical and does not describe any hardware used to run experiments.
Software Dependencies | No | The paper is theoretical and does not specify versioned software dependencies for running experiments.
Experiment Setup | No | The paper is theoretical and does not provide experimental setup details such as hyperparameter values or training configurations needed for empirical reproduction.
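
For readers who want a concrete picture of the paper's Algorithm 1 (Noisy Gradient Descent for Training Two-layer Networks), the sketch below implements one plausible reading of it: a gradient step on the training loss plus an explicit weight-decay term and injected Gaussian noise, i.e. roughly W_{t+1} = (1 - eta * lam) * W_t - eta * grad L(W_t) + noise. This is a minimal illustration, not the paper's reference implementation; the squared loss, the noise scale tau, the fixed random-sign output layer a, and the ReLU activation are assumptions made here for concreteness, and the paper's Algorithm 1 fixes these choices precisely.

    # Minimal sketch of noisy gradient descent with weight decay for a
    # two-layer ReLU network f(x) = (1/sqrt(m)) * sum_r a_r * relu(<w_r, x>).
    # Assumptions not taken from the paper: squared loss, Gaussian noise of
    # scale tau, and a fixed random sign output layer a.
    import numpy as np

    def forward(W, a, X):
        """Two-layer network output for inputs X (n, d); W is (m, d), a is (m,)."""
        m = W.shape[0]
        return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)

    def noisy_gd(X, y, m=512, eta=0.1, lam=1e-3, tau=1e-3, steps=200, seed=0):
        """Train the hidden weights W with noisy gradient descent plus weight decay."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = rng.normal(size=(m, d))            # random Gaussian initialization
        a = rng.choice([-1.0, 1.0], size=m)    # output layer kept fixed
        for _ in range(steps):
            resid = forward(W, a, X) - y       # residuals of the squared loss
            act = (X @ W.T > 0).astype(float)  # ReLU derivative, shape (n, m)
            # gradient of L(W) = (1/2n) * sum_i (f(x_i) - y_i)^2 w.r.t. W
            grad = (act * resid[:, None]).T @ X * (a[:, None] / (np.sqrt(m) * n))
            noise = tau * rng.normal(size=W.shape)   # injected Gaussian noise
            W = W - eta * (grad + lam * W) + noise   # weight decay in the step
        return W, a

    # Toy usage: Gaussian inputs, labels given by the sign of one coordinate.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))
    y = np.sign(X[:, 0])
    W, a = noisy_gd(X, y)
    print(np.mean((forward(W, a, X) - y) ** 2))  # training MSE after training

In this sketch the weight-decay term shrinks the iterates toward the origin at every step, keeping them in a bounded region around initialization; that is consistent with, though no substitute for, the kernel-like training behavior the paper proves for its noisy dynamics.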