Neural Tangent Kernel Analysis of Deep Narrow Neural Networks
Authors: Jongmin Lee, Joo Young Choi, Ernest K. Ryu, Albert No
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally demonstrate the invariance of the scaled NTK and the trainability of deep neural networks. The code is provided as supplementary material. |
| Researcher Affiliation | Academia | (1) Department of Mathematical Sciences, Seoul National University, Seoul, Korea; (2) Department of Electronic and Electrical Engineering, Hongik University, Seoul, Korea. |
| Pseudocode | No | The paper contains detailed mathematical derivations and proofs but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is provided as supplementary material. |
| Open Datasets | Yes | Next, we demonstrate the empirical trainability of the deep narrow networks on the MNIST dataset. |
| Dataset Splits | No | The paper mentions training and testing on the MNIST dataset, but it does not specify any explicit validation dataset splits (e.g., percentages, sample counts, or predefined splits). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states that 'The code is provided as supplementary material' but does not list any specific software dependencies with version numbers. |
| Experiment Setup | Yes | We train L-layer MLPs with d_in = 784 and d_out = 10 using the quadratic loss with one-hot vectors as targets. To establish a point of comparison, we attempt to train a 1000-layer MLP with the typical Kaiming He uniform initialization (He et al., 2015). We tuned the learning rate via a grid search from 0.00001 to 1.0, but the network was untrainable, as one would expect based on the prior findings of He & Sun (2015), Srivastava et al. (2015), and Huang et al. (2020). |
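
The setup quoted in the last row omits several details (hidden width, optimizer, batch size, epochs). Below is a minimal PyTorch sketch of that baseline, assuming a hidden width of 32, plain SGD, and a batch size of 128; only the Kaiming He uniform initialization, the quadratic loss with one-hot targets, MNIST, and the learning-rate grid from 0.00001 to 1.0 come from the quoted setup.

```python
# Hypothetical sketch of the quoted baseline: a very deep MLP on MNIST with
# quadratic loss and Kaiming He uniform initialization. Hidden width, optimizer,
# and batch size are assumptions, not values stated in the table.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_mlp(depth: int, d_in: int = 784, d_out: int = 10, hidden: int = 32) -> nn.Sequential:
    """L-layer MLP with ReLU activations and Kaiming He uniform init (He et al., 2015)."""
    dims = [d_in] + [hidden] * (depth - 1) + [d_out]
    layers = []
    for i in range(depth):
        linear = nn.Linear(dims[i], dims[i + 1])
        nn.init.kaiming_uniform_(linear.weight, nonlinearity="relu")
        layers.append(linear)
        if i < depth - 1:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)

# Learning-rate grid search from 0.00001 to 1.0, as in the quoted setup.
for lr in [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    model = make_mlp(depth=1000)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in loader:
        opt.zero_grad()
        out = model(x.view(x.size(0), -1))
        # Quadratic loss against one-hot target vectors.
        loss = F.mse_loss(out, F.one_hot(y, 10).float())
        loss.backward()
        opt.step()
        break  # single step shown; full training loops over epochs
    print(f"lr={lr:g}  first-step loss={loss.item():.4f}")
```

With 1000 layers the forward pass is slow but well defined; as the row notes, this Kaiming-initialized baseline is expected to be untrainable at every learning rate in the grid.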
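
The first row quotes the paper's aim of experimentally demonstrating the invariance of the scaled NTK during training. As a rough illustration (an assumption about methodology, not the authors' supplementary code), a single empirical NTK entry for one output coordinate is the inner product of parameter gradients at two inputs:

```python
import torch

def ntk_entry(model: torch.nn.Module, x1: torch.Tensor, x2: torch.Tensor,
              out_idx: int = 0) -> torch.Tensor:
    """Empirical NTK entry: <grad_theta f(x1)[out_idx], grad_theta f(x2)[out_idx]>.

    x1 and x2 are single inputs with a batch dimension, e.g. shape (1, 784).
    """
    params = [p for p in model.parameters() if p.requires_grad]
    g1 = torch.autograd.grad(model(x1)[0, out_idx], params)
    g2 = torch.autograd.grad(model(x2)[0, out_idx], params)
    # Inner product accumulated over all parameter tensors.
    return sum((a * b).sum() for a, b in zip(g1, g2))
```

Comparing such entries (suitably scaled) at initialization and after training steps is one way to probe empirically whether the kernel stays approximately constant.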