Characterizing the spectrum of the NTK via a power series expansion
Authors: Michael Murray, Hui Jin, Benjamin Bowman, Guido Montufar
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figure 1 we empirically validate our theory by computing the spectrum of the NTK on both Caltech101 (Li et al., 2022) and isotropic Gaussian data for feedforward networks. |
| Researcher Affiliation | Academia | Department of Mathematics, UCLA, CA, USA; Department of Statistics, UCLA, CA, USA; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; [mmurray,huijin,benbowman314,montufar]@math.ucla.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Reproducibility Statement: To ensure reproducibility, we make the code public at https://github.com/bbowman223/data_ntk. |
| Open Datasets | Yes | In Figure 1 we empirically validate our theory by computing the spectrum of the NTK on both Caltech101 (Li et al., 2022) and isotropic Gaussian data for feedforward networks. |
| Dataset Splits | No | The paper mentions using a 'batch size of n = 200' and plotting 'the first 100 eigenvalues' but does not provide specific details on how the datasets (Caltech101, isotropic Gaussian data) were split into training, validation, or test sets. |
| Hardware Specification | No | The paper states 'We use the functorch module in PyTorch (Paszke et al., 2019)' and mentions running experiments, but does not provide specific details on the hardware (e.g., GPU/CPU models, memory) used for these experiments. |
| Software Dependencies | No | The paper mentions the 'functorch module in PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for these or any other software dependencies needed for replication. |
| Experiment Setup | Yes | For the feedforward architectures we consider networks of depth 2 and 5 with the width of all layers being set at 500. With regard to the activation function we test linear, ReLU and Tanh, and in terms of initialization we use Kaiming uniform (He et al., 2015)... For the convolutional architectures we again consider depths 2 and 5, with each layer consisting of 100 channels with the filter size set to 5x5... The batch size is fixed at 200 and we plot only the first 100 normalized eigenvalues. Each experiment was repeated 10 times. (A minimal sketch of this setup follows the table.) |
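
The following is a minimal sketch (not the authors' released code at the repository above) of the feedforward variant of this setup: a depth-2 network of width 500, computing the empirical NTK Gram matrix on a batch of n = 200 isotropic Gaussian inputs via parameter Jacobians, and then inspecting the first 100 normalized eigenvalues. It assumes torch.func (the successor of the functorch module cited in the paper); the input dimension, the Caltech101 pipeline, the CNN variant, the 10 repetitions, and the exact normalization of the eigenvalues are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.func import functional_call, jacrev, vmap

torch.manual_seed(0)
d, width, n = 100, 500, 200  # input dim d is illustrative; width and batch size from the paper

# Depth-2 feedforward network with ReLU activations and scalar output.
# PyTorch's default nn.Linear initialization is Kaiming uniform.
model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
params = dict(model.named_parameters())

def f(p, x):
    # Scalar network output for a single input x under parameters p.
    return functional_call(model, p, (x.unsqueeze(0),)).squeeze()

X = torch.randn(n, d)  # isotropic Gaussian batch

# Per-sample Jacobians of the output w.r.t. all parameters,
# flattened into the rows of an (n, num_params) matrix J.
jac = vmap(jacrev(f), in_dims=(None, 0))(params, X)
J = torch.cat([j.reshape(n, -1) for j in jac.values()], dim=1)

# Empirical NTK Gram matrix and its spectrum.
K = J @ J.T                                 # (n, n)
eigvals = torch.linalg.eigvalsh(K).flip(0)  # descending order
normalized = eigvals / eigvals[0]           # normalization by the top eigenvalue is an assumed choice
print(normalized[:100])
```

Repeating this over 10 random draws of the data and parameters, and swapping the ReLU for linear or Tanh activations or the MLP for the 5x5-filter, 100-channel convolutional architecture, would reproduce the remaining configurations described in the setup.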