Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization

Authors: Mariia Seleznova, Gitta Kutyniok

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide extensive numerical experiments to verify our theoretical results. We use JAX (Bradbury et al., 2018) and Flax (neural network library for JAX) (Heek et al., 2020) to compute the NTK of fully-connected ReLU networks effortlessly. (A minimal NTK-computation sketch follows the table.)
Researcher Affiliation | Academia | Department of Mathematics, Ludwig-Maximilians-Universität München, Munich, Germany.
Pseudocode | No | The paper describes mathematical derivations and experimental procedures textually and through equations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Source code to reproduce the presented results is available at: https://github.com/mselezniova/ntk_beyond_limit.
Open Datasets | Yes | Our experiments in Figure 4 confirm this intuition in a simple setting of fully-connected ReLU networks trained on MNIST.
Dataset Splits | No | The paper mentions training on MNIST but does not provide explicit details about train/validation/test splits, such as percentages, sample counts, or specific splitting methodology.
Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models or memory specifications) used for running the experiments.
Software Dependencies | No | We use JAX (Bradbury et al., 2018) and Flax (neural network library for JAX) (Heek et al., 2020) to compute the NTK of fully-connected ReLU networks effortlessly. Specific version numbers for these software dependencies are not provided.
Experiment Setup | Yes | The DNNs are initialized with σ_w² ∈ {1.0, 2.0, 2.2} and trained on MNIST using the Adam algorithm with learning rate 10⁻⁵. (A training-setup sketch follows the table.)
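
The paper states only that JAX and Flax were used to compute the NTK of fully-connected ReLU networks; it does not publish the computation itself here. The following is a minimal sketch, not the authors' code, of one way such an empirical NTK evaluation could look. The architecture (width 128, depth 3, scalar output) and the helper names `MLP`, `f`, and `empirical_ntk` are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    """Fully-connected ReLU network with scalar output (sizes are assumptions)."""
    width: int = 128
    depth: int = 3

    @nn.compact
    def __call__(self, x):
        for _ in range(self.depth):
            x = nn.relu(nn.Dense(self.width)(x))
        return nn.Dense(1)(x)

model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 784)))

def f(params, x):
    # Scalar network output for a single flattened MNIST-sized input.
    return model.apply(params, x[None, :])[0, 0]

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = <df(x1)/dtheta, df(x2)/dtheta>, summed over all parameter tensors.
    g1 = jax.grad(f)(params, x1)
    g2 = jax.grad(f)(params, x2)
    return sum(jnp.vdot(a, b)
               for a, b in zip(jax.tree_util.tree_leaves(g1),
                               jax.tree_util.tree_leaves(g2)))

x1, x2 = jnp.ones(784), jnp.linspace(0.0, 1.0, 784)
print(empirical_ntk(params, x1, x2))
```

For batches of inputs the same Jacobian contraction can be vectorized with jax.vmap; since the paper only names JAX and Flax, this is one of several reasonable realizations rather than the authors' implementation.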
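The experiment-setup row quotes initialization variances σ_w² ∈ {1.0, 2.0, 2.2} and Adam with learning rate 10⁻⁵ on MNIST. The sketch below is a hedged illustration of how those hyperparameters could be wired into a Flax/Optax training step; the depth, width, batch size, loss function, and dummy MNIST-shaped data are assumptions, not the paper's exact configuration.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax

def make_mlp(sigma_w_sq, width=128, depth=3, num_classes=10):
    # Weights ~ N(0, sigma_w^2 / fan_in), the usual reading of sigma_w^2 in this setting.
    kernel_init = nn.initializers.variance_scaling(sigma_w_sq, "fan_in", "normal")

    class MLP(nn.Module):
        @nn.compact
        def __call__(self, x):
            for _ in range(depth):
                x = nn.relu(nn.Dense(width, kernel_init=kernel_init)(x))
            return nn.Dense(num_classes, kernel_init=kernel_init)(x)

    return MLP()

for sigma_w_sq in (1.0, 2.0, 2.2):          # initialization variances quoted in the table
    model = make_mlp(sigma_w_sq)
    params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 784)))
    tx = optax.adam(learning_rate=1e-5)      # learning rate quoted in the table
    opt_state = tx.init(params)

    @jax.jit
    def train_step(params, opt_state, x, y):
        def loss_fn(p):
            logits = model.apply(p, x)
            return optax.softmax_cross_entropy_with_integer_labels(logits, y).mean()
        grads = jax.grad(loss_fn)(params)
        updates, opt_state = tx.update(grads, opt_state)
        return optax.apply_updates(params, updates), opt_state

    # One update on a dummy MNIST-shaped batch, just to show the wiring.
    x_batch = jnp.ones((32, 784))
    y_batch = jnp.zeros(32, dtype=jnp.int32)
    params, opt_state = train_step(params, opt_state, x_batch, y_batch)
```

Only the initialization variances, the dataset, the optimizer, and the learning rate are taken from the paper's description; everything else here is a placeholder chosen for brevity.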