Towards Understanding the Spectral Bias of Deep Learning

Authors: Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we provide numerical experiments to demonstrate the correctness of our theory. Our experimental results also show that our theory can tolerate certain model misspecification in terms of the input data distribution." "We also conduct experiments to corroborate the theory we establish." "In this section we present experimental results to verify our theory."
Researcher Affiliation | Academia | 1 Department of Computer Science, University of California, Los Angeles; 2 School of Data Science and Department of Mathematics, City University of Hong Kong
Pseudocode | Yes | "Algorithm 1 GD for DNNs starting at Gaussian initialization"
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | No | The paper mentions generating synthetic data based on spherical harmonics and non-uniform distributions, but does not provide concrete access to these datasets (e.g., a link or a formal citation to a public repository).
Dataset Splits | No | The paper states that the "training sample size is 1000" but does not specify explicit train/validation/test splits, percentages, or sample counts for dataset partitioning.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for the experiments.
Software Dependencies | No | The paper mentions "vanilla gradient descent" but does not list specific software dependencies with version numbers (e.g., Python or PyTorch versions).
Experiment Setup | Yes | "Across all tasks, we train a two-layer neural network with 4096 hidden neurons and initialize it exactly as defined in the problem setup. The optimization method is vanilla gradient descent, and the training sample size is 1000." "Algorithm 1 with η = Õ(m^{-1} θ^2), θ = Õ(ε) satisfies ..."
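The quoted setup (a two-layer network with 4096 hidden neurons, Gaussian initialization, vanilla gradient descent, 1000 training samples) can be sketched as follows. This is a minimal illustrative reconstruction, not the paper's code: the input dimension, target function, learning rate, and step count are assumptions, and the second layer is held fixed during training, a common simplification in NTK-style analyses.

```python
import numpy as np

# Hedged sketch of the quoted experiment setup: a two-layer ReLU
# network with m = 4096 hidden neurons, Gaussian initialization,
# trained by full-batch vanilla gradient descent on n = 1000 points.
# Input dimension, target, learning rate, and step count are
# illustrative assumptions, not values taken from the paper.
rng = np.random.default_rng(0)
n, d, m = 1000, 10, 4096                 # sample size, input dim, width

X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # inputs on the unit sphere
y = np.sign(X[:, 0])                            # toy target (assumption)

# Gaussian initialization of both layers
W = rng.standard_normal((m, d)) / np.sqrt(d)      # first layer, trained
v = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # second layer, kept fixed

def predict(X, W, v):
    H = np.maximum(X @ W.T, 0.0)         # ReLU hidden activations
    return H @ v, H

def loss(pred, y):
    return 0.5 * np.mean((pred - y) ** 2)

init_loss = loss(predict(X, W, v)[0], y)

eta = 0.2                                # learning rate (assumption)
for _ in range(200):                     # vanilla full-batch GD
    pred, H = predict(X, W, v)
    err = (pred - y) / n                 # squared-loss residual
    # Gradient w.r.t. first-layer weights only; the ReLU derivative
    # is the activation indicator (H > 0).
    grad_W = ((err[:, None] * (H > 0.0)) * v).T @ X
    W -= eta * grad_W

final_loss = loss(predict(X, W, v)[0], y)
print(init_loss, final_loss)
```

With this width and step size the training loss decreases monotonically on the toy target, which is the regime the paper's theory (near-linear dynamics of wide networks under gradient descent) concerns.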