Fast Neural Kernel Embeddings for General Activations

Authors: Insu Han, Amir Zandieh, Jaehoon Lee, Roman Novak, Lechao Xiao, Amin Karbasi

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we perform experiments with the proposed neural kernels based on our dual kernel approximation. All experiments run using a single A100 GPU machine. We first benchmark our algorithm for approximating the dual kernel matrix. We use ReLU, Abs (i.e., σ(t) = |t|), sin, Gaussian, erf, and GeLU activations and approximate them by their Hermite expansions, with degree ranging from q = 1 to 20. We randomly generate n = 1,000 inputs of dimension 256, where each entry is i.i.d. drawn from N(0, 1/256). We also compare our approach to the Monte Carlo estimation of the dual kernel, i.e., K_σ(x, y) ≈ (1/m) ∑_{i=1}^{m} σ(⟨w_i, x⟩) σ(⟨w_i, y⟩), where {w_i}_{i=1}^{m} are i.i.d. standard Gaussian vectors. In Figure 1, we plot relative errors in Frobenius norm of the kernel approximations against wall-clock time (top) and polynomial degree (bottom). (See the sketch below for the Monte Carlo baseline.)
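
As a concrete illustration of the Monte Carlo baseline quoted above, here is a minimal JAX sketch; `mc_dual_kernel` and its parameters are illustrative names, not the paper's released code. For ReLU, the closed-form arc-cosine kernel of degree 1 (Cho and Saul, 2009) is included as a ground-truth check.

```python
import jax
import jax.numpy as jnp

def mc_dual_kernel(x, y, sigma, m=8192, seed=0):
    """Monte Carlo estimate of the dual kernel:
    K_sigma(x, y) ~ (1/m) * sum_i sigma(<w_i, x>) * sigma(<w_i, y>),
    with w_i drawn i.i.d. from N(0, I_d)."""
    d = x.shape[-1]
    w = jax.random.normal(jax.random.PRNGKey(seed), (m, d))
    return jnp.mean(sigma(w @ x) * sigma(w @ y))

def relu_dual_exact(x, y):
    """Closed-form ReLU dual kernel (arc-cosine kernel of degree 1):
    K(x, y) = (|x||y| / (2*pi)) * (sin t + (pi - t) cos t), t = angle(x, y)."""
    nx, ny = jnp.linalg.norm(x), jnp.linalg.norm(y)
    t = jnp.arccos(jnp.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny / (2 * jnp.pi) * (jnp.sin(t) + (jnp.pi - t) * jnp.cos(t))

# Inputs matching the quoted setup: entries i.i.d. N(0, 1/256) in dimension 256.
x = jax.random.normal(jax.random.PRNGKey(1), (256,)) / jnp.sqrt(256.0)
y = jax.random.normal(jax.random.PRNGKey(2), (256,)) / jnp.sqrt(256.0)
print(mc_dual_kernel(x, y, jax.nn.relu), relu_dual_exact(x, y))
```

The Monte Carlo estimate converges to the exact value at rate O(1/√m), which is why the paper's Hermite-expansion approach can be much faster at comparable accuracy.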
Researcher Affiliation | Collaboration | Insu Han¹, Amir Zandieh², Jaehoon Lee³, Roman Novak³, Lechao Xiao³, Amin Karbasi¹,³ (¹Yale University, ²Max-Planck-Institut für Informatik, ³Google Research)
Pseudocode | Yes | Algorithm 1: Subspace Embedding of Homogeneous NNGP and NTK
Open Source Code | Yes | We open-source NNGP and NTK for new activations within the Neural Tangents library [42] and the sketching algorithm at https://github.com/insuhan/ntk_activations. (A usage sketch follows below.)
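
For context, the Neural Tangents `stax` API referenced above evaluates infinite-width NNGP/NTK kernels in closed form. The following is a minimal sketch; the architecture, widths, and input shapes are illustrative assumptions, not the paper's experimental configuration.

```python
import jax
from neural_tangents import stax

# stax.serial returns (init_fn, apply_fn, kernel_fn); kernel_fn computes the
# exact NNGP and NTK kernels via the dual-kernel recursion, layer by layer.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x1 = jax.random.normal(jax.random.PRNGKey(0), (8, 256))
x2 = jax.random.normal(jax.random.PRNGKey(1), (8, 256))
nngp, ntk = kernel_fn(x1, x2, ('nngp', 'ntk'))  # two 8x8 Gram matrices
```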
Open Datasets | Yes | Empirically, with respect to exact convolutional NTK (CNTK) computation, our method achieves a 10⁶× speedup for approximate CNTK of a 5-layer Myrtle network on the CIFAR-10 dataset.
Dataset Splits | No | The paper reports selecting "the best test accuracy among 20 choices of ridge parameters", which implies hyperparameter tuning, but it does not specify a distinct validation split, nor explicit percentages or sample counts for one.
Hardware Specification | Yes | All experiments run using a single A100 GPU machine.
Software Dependencies | No | The paper mentions the Neural Tangents library [42] but does not provide version numbers for it or any other software dependency.
Experiment Setup | Yes | We extract CNTK features of a 5-layer convolutional neural network (known as Myrtle5 [54]) without pooling by setting degree q = 8, and explore feature dimensions m ∈ {2⁹, …, 2¹⁴} and homogeneous dual kernels including ReLU and ABReLU activations, as well as deep normalized Gaussian kernels with 2 scaling factors. (See the sketch below for the exact-CNTK baseline this setup approximates.)
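
To make the quoted setup concrete, here is a hedged sketch of the exact CNTK for a pooling-free Myrtle5-style network in Neural Tangents; channel counts, padding, and the output head are assumptions, and the paper's contribution is to replace this exact computation with much cheaper sketched features.

```python
import jax
from neural_tangents import stax

# Pooling-free Myrtle5-style ConvNet (widths and padding are assumptions;
# see the paper and https://github.com/insuhan/ntk_activations for the
# exact architecture and the sketching-based approximation).
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.Flatten(),
    stax.Dense(10),
)

# CIFAR-10-shaped inputs; exact CNTK cost grows rapidly with image size,
# which is what motivates the paper's ~10^6x-faster sketched features.
x = jax.random.normal(jax.random.PRNGKey(0), (4, 32, 32, 3))
cntk = kernel_fn(x, x, 'ntk')  # exact CNTK Gram matrix, shape (4, 4)
```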