Neural (Tangent Kernel) Collapse
Authors: Mariia Seleznova, Dana Weitzner, Raja Giryes, Gitta Kutyniok, Hung-Hsu Chou
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide large-scale numerical experiments on three common DNN architectures and three benchmark datasets to support our theory. |
| Researcher Affiliation | Academia | 1Ludwig-Maximilians-Universität München 2Tel Aviv University |
| Pseudocode | No | The paper contains mathematical derivations and theoretical analyses but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code to reproduce the results is available in the project's GitHub repository. |
| Open Datasets | Yes | Our datasets are MNIST [35], Fashion MNIST [51] and CIFAR10 [34]. |
| Dataset Splits | No | The paper does not provide specific details on training/validation/test dataset splits, such as percentages or sample counts. It mentions training for 400 epochs but does not describe how the data are split. |
| Hardware Specification | Yes | We executed the numerical experiments mainly on NVIDIA GeForce RTX 3090 Ti GPUs; each model was trained on a single GPU. |
| Software Dependencies | No | We use JAX [8] and Flax (neural network library for JAX) [25] to implement all the DNN architectures and the training routines. While these software components are mentioned, specific version numbers are not provided for JAX or Flax. (A minimal Flax sketch follows the table.) |
| Experiment Setup | Yes | We use SGD with Nesterov momentum 0.9 and weight decay 5e-4. Every model is trained for 400 epochs with batches of size 120. To be consistent with the theory, we balance the batches exactly. We train every model with a set of initial learning rates spaced logarithmically in the range η ∈ [10^-4, 10^0.25]. The learning rate is divided by 10 every 120 epochs. (An optimizer sketch follows the table.) |
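The Software Dependencies row notes that the architectures and training routines are implemented in JAX and Flax. The snippet below is a minimal sketch of how a Flax model can be defined and initialized; the `MLP` class, its widths, and the CIFAR10-shaped dummy input are illustrative assumptions, not the paper's actual architectures (those are in the authors' repository).

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    """Illustrative fully connected network; not the paper's actual architectures."""
    hidden_width: int = 512   # assumed width, for illustration only
    num_classes: int = 10     # MNIST / Fashion MNIST / CIFAR10 all have 10 classes

    @nn.compact
    def __call__(self, x):
        x = x.reshape((x.shape[0], -1))              # flatten image inputs
        x = nn.relu(nn.Dense(self.hidden_width)(x))
        x = nn.relu(nn.Dense(self.hidden_width)(x))
        return nn.Dense(self.num_classes)(x)         # class logits

# Initialize parameters for a CIFAR10-shaped dummy batch.
model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 32, 32, 3)))
```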
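The Experiment Setup row describes SGD with Nesterov momentum 0.9, weight decay 5e-4, batch size 120, initial learning rates log-spaced in [10^-4, 10^0.25], and a step schedule that divides the learning rate by 10 every 120 epochs over 400 epochs. A plausible way to express this in JAX is with Optax, as sketched below; the use of Optax, the number of learning rates in the sweep, and whether weight decay is coupled or decoupled are assumptions not stated in this summary.

```python
import numpy as np
import optax

# Initial learning rates spaced logarithmically in [10^-4, 10^0.25];
# the number of points in the sweep (here 10) is an assumption.
initial_lrs = np.logspace(-4, 0.25, num=10)

def make_optimizer(eta0: float, steps_per_epoch: int) -> optax.GradientTransformation:
    """SGD with Nesterov momentum 0.9, weight decay 5e-4, and a step LR schedule."""
    # Divide the learning rate by 10 every 120 epochs (400 epochs total).
    schedule = optax.piecewise_constant_schedule(
        init_value=eta0,
        boundaries_and_scales={
            120 * steps_per_epoch: 0.1,
            240 * steps_per_epoch: 0.1,
            360 * steps_per_epoch: 0.1,
        },
    )
    return optax.chain(
        # Weight decay 5e-4, added to the gradients before the SGD update;
        # whether the paper couples or decouples the decay is an assumption.
        optax.add_decayed_weights(5e-4),
        optax.sgd(learning_rate=schedule, momentum=0.9, nesterov=True),
    )
```

With batches of size 120 that are exactly class-balanced, `steps_per_epoch` would be the number of training samples divided by 120.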