Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
Authors: Mariia Seleznova, Gitta Kutyniok
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive numerical experiments to verify our theoretical results. We use JAX (Bradbury et al., 2018) and Flax (neural network library for JAX) (Heek et al., 2020) to compute the NTK of fully-connected ReLU networks effortlessly. |
| Researcher Affiliation | Academia | Department of Mathematics, Ludwig-Maximilians-Universität München, Munich, Germany. |
| Pseudocode | No | The paper describes mathematical derivations and experimental procedures textually and through equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code to reproduce the presented results is available at: https://github.com/mselezniova/ntk_beyond_limit. |
| Open Datasets | Yes | Our experiments in Figure 4 confirm this intuition in a simple setting of fully-connected ReLU networks trained on MNIST. |
| Dataset Splits | No | The paper mentions training on MNIST but does not provide explicit details about train/validation/test splits, such as percentages, sample counts, or specific splitting methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models or memory specifications) used for running the experiments. |
| Software Dependencies | No | We use JAX (Bradbury et al., 2018) and Flax (neural network library for JAX) (Heek et al., 2020) to compute the NTK of fully-connected ReLU networks effortlessly. Specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | The DNNs are initialized with σ_w² ∈ {1.0, 2.0, 2.2} and trained on MNIST using the Adam algorithm with learning rate 10⁻⁵. |
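
The setup quoted in the Software Dependencies and Experiment Setup rows (JAX for the NTK computation, fully-connected ReLU networks, weight variance σ_w² ∈ {1.0, 2.0, 2.2}) can be illustrated with a minimal sketch. This is not the authors' released code: the layer widths, the MNIST-sized input dimension of 784, the absence of bias terms, and the helper names `init_params`, `apply_fn`, and `empirical_ntk` are illustrative assumptions; only the σ_w²-scaled Gaussian initialization and the ReLU architecture follow the paper's description. The empirical NTK is computed here as the Gram matrix of parameter Jacobians.

```python
# Minimal sketch (not the authors' code): empirical NTK of a fully-connected
# ReLU network in plain JAX. Widths and helper names are illustrative.
import jax
import jax.numpy as jnp

def init_params(key, widths, sigma_w2=2.0):
    """Gaussian initialization with variance sigma_w^2 / fan_in per layer."""
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (d_in, d_out)) * jnp.sqrt(sigma_w2 / d_in)
        params.append(W)
    return params

def apply_fn(params, x):
    """Fully-connected ReLU network with a scalar output (no biases)."""
    h = x
    for W in params[:-1]:
        h = jax.nn.relu(h @ W)
    return (h @ params[-1]).squeeze(-1)

def empirical_ntk(params, x1, x2):
    """Empirical NTK: Theta[i, j] = <df(x1_i)/dtheta, df(x2_j)/dtheta>."""
    def single_jac(x):
        # Gradient of the scalar output w.r.t. all parameters, flattened.
        grads = jax.grad(lambda p: apply_fn(p, x[None, :])[0])(params)
        return jnp.concatenate([g.ravel() for g in grads])
    J1 = jax.vmap(single_jac)(x1)  # (n1, n_params)
    J2 = jax.vmap(single_jac)(x2)  # (n2, n_params)
    return J1 @ J2.T

key = jax.random.PRNGKey(0)
init_key, data_key = jax.random.split(key)
widths = [784, 512, 512, 1]                 # MNIST-sized inputs, two hidden layers
params = init_params(init_key, widths, sigma_w2=2.0)
x = jax.random.normal(data_key, (8, 784))   # stand-in for a small MNIST batch
print(empirical_ntk(params, x, x).shape)    # (8, 8)
```

Materializing the full per-example Jacobian, as done above, is only practical for moderate widths and small batches; for larger networks the same kernel would typically be assembled from Jacobian-vector products instead.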