Neural Tangent Kernel Analysis of Deep Narrow Neural Networks

Authors: Jongmin Lee, Joo Young Choi, Ernest K Ryu, Albert No

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we experimentally demonstrate the invariance of the scaled NTK and the trainability of deep neural networks. The code is provided as supplementary material.
Researcher Affiliation | Academia | (1) Department of Mathematical Sciences, Seoul National University, Seoul, Korea; (2) Department of Electronic and Electrical Engineering, Hongik University, Seoul, Korea.
Pseudocode | No | The paper contains detailed mathematical derivations and proofs but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | The code is provided as supplementary material.
Open Datasets | Yes | Next, we demonstrate the empirical trainability of the deep narrow networks on the MNIST dataset.
Dataset Splits | No | The paper mentions training and testing on the MNIST dataset, but it does not specify any explicit validation dataset splits (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states that 'The code is provided as supplementary material' but does not list any specific software dependencies with version numbers.
Experiment Setup | Yes | We train L-layer MLPs with d_in = 784 and d_out = 10 using the quadratic loss with one-hot vectors as targets. To establish a point of comparison, we attempt to train a 1000-layer MLP with the typical Kaiming He uniform initialization (He et al., 2015). We tuned the learning rate via a grid search from 0.00001 to 1.0, but the network was untrainable, as one would expect based on the prior findings of (He & Sun, 2015; Srivastava et al., 2015; Huang et al., 2020).
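The Experiment Setup row above is concrete enough to sketch in code. The following is a minimal illustration only, assuming PyTorch and torchvision; it is not the authors' supplementary code. The hidden width (128), the mini-batch SGD optimizer, batch size, epoch count, and the helper names make_mlp and train are illustrative assumptions, while d_in = 784, d_out = 10, the quadratic loss on one-hot targets, Kaiming He uniform initialization, the 1000-layer depth, and the learning-rate range 0.00001 to 1.0 come from the quoted setup.

```python
# Minimal sketch of the quoted setup, assuming PyTorch/torchvision.
# Width, optimizer, batch size, and epochs are illustrative assumptions,
# not values reported by the authors.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_mlp(depth, width=128, d_in=784, d_out=10):
    """L-layer MLP with Kaiming He uniform initialization (He et al., 2015)."""
    dims = [d_in] + [width] * (depth - 1) + [d_out]
    layers = []
    for i in range(depth):
        linear = nn.Linear(dims[i], dims[i + 1])
        nn.init.kaiming_uniform_(linear.weight, nonlinearity='relu')
        layers.append(linear)
        if i < depth - 1:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

def train(depth, lr, epochs=1, batch_size=128):
    data = datasets.MNIST('.', train=True, download=True,
                          transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=batch_size, shuffle=True)
    model = make_mlp(depth)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            pred = model(x.view(x.size(0), -1))            # flatten 28x28 -> 784
            target = F.one_hot(y, num_classes=10).float()  # one-hot targets
            loss = F.mse_loss(pred, target)                # quadratic loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Learning-rate grid over the quoted range (0.00001 to 1.0) for the
# 1000-layer comparison network.
for lr in [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    train(depth=1000, lr=lr)
```

The paper reports that the 1000-layer network with this initialization was untrainable over that grid, so the point of such a run is the failure of the baseline itself.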
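The Research Type row also quotes the paper's experimental demonstration of the invariance of the scaled NTK. As a loose illustration of what such a check involves, the sketch below computes an unscaled empirical NTK block between two inputs using torch.func; the depth-dependent scaling the authors apply is not reproduced here, and empirical_ntk is a hypothetical helper name.

```python
# Unscaled empirical NTK between two inputs, assuming PyTorch >= 2.0 (torch.func).
# An illustration of the flavor of experiment only, not the paper's code.
import torch
from torch.func import functional_call, jacrev

def empirical_ntk(model, x1, x2):
    """Theta(x1, x2) = sum over parameters of J(x1) J(x2)^T, a (d_out, d_out) block."""
    params = {k: v.detach() for k, v in model.named_parameters()}

    def f(p, x):
        # Forward pass on a single input, expressed as a function of the parameters.
        return functional_call(model, p, (x.unsqueeze(0),)).squeeze(0)

    # Per-parameter Jacobians of the outputs, each of shape (d_out, *param.shape).
    jac1 = jacrev(f)(params, x1)
    jac2 = jacrev(f)(params, x2)

    # Flatten each Jacobian to (d_out, num_params) and accumulate the contractions.
    return sum(jac1[k].flatten(1) @ jac2[k].flatten(1).T for k in jac1)
```

Evaluating this kernel on the same pair of inputs at initialization and again after training is the kind of before/after comparison the quoted sentence refers to.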