Scaling Neural Tangent Kernels via Sketching and Random Features

Authors: Amir Zandieh, Insu Han, Haim Avron, Neta Shoham, Chaewon Kim, Jinwoo Shin

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We benchmark our methods on various large-scale regression and classification tasks and show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on the CIFAR-10 dataset while achieving a 150× speedup."
Researcher Affiliation | Academia | Amir Zandieh (Max-Planck-Institut für Informatik, azandieh@mpi-inf.mpg.de); Insu Han (Yale University, insu.han@yale.edu); Haim Avron (Tel Aviv University, haimav@tauex.tau.ac.il); Neta Shoham (Tel Aviv University, shohamne@gmail.com); Chaewon Kim (KAIST, chaewonk@kaist.ac.kr); Jinwoo Shin (KAIST, jinwoos@kaist.ac.kr)
Pseudocode | Yes | "Algorithm 1: NTKSKETCH for fully-connected ReLU networks ... Algorithm 2: Random Features for ReLU NTK via POLYSKETCH"
Open Source Code | Yes | "Codes are available at https://github.com/insuhan/ntk-sketch-rf."
Open Datasets | Yes | "We first benchmark our proposed NTK approximation algorithms on the MNIST [25] dataset and compare against gradient-based NTK random features [5] (GRADRF) as a baseline method. Next we test our CNTKSKETCH on the CIFAR-10 dataset [24]. We also demonstrate the computational efficiency of our NTKSKETCH and NTKRF using 4 large-scale UCI regression datasets [17]."
Dataset Splits | No | "We search the ridge parameter with a random subset of training set and choose the one that achieves the best validation accuracy." The paper mentions using a "random subset of training set" for validation but does not report the subset's size or how it was drawn, which limits reproducibility.
Hardware Specification | Yes | "We run experiments on a system with an Intel E5-2630 CPU with 256 GB RAM and a single GeForce RTX 2080 GPU with 12 GB RAM."
Software Dependencies | No | The paper states "Codes are available at https://github.com/insuhan/ntk-sketch-rf" but does not list specific software dependencies with version numbers in the text.
Experiment Setup | Yes | "We use the ReLU network with depth L = 1. We search the ridge parameter with a random subset of training set and choose the one that achieves the best validation accuracy. We choose a convolutional network of depth L = 3 and compare CNTKSKETCH and GRADRF for various feature dimensions. For our methods and RFF, we fix the output dimension to m = 8,192 for all datasets."
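
The Pseudocode row references random features for the ReLU NTK. As a rough illustration of what such a feature map computes, the following sketch uses plain arc-cosine random features for a depth-1 ReLU NTK; it is not the paper's NTKSKETCH or POLYSKETCH construction, and the parameters `m` and `seed` are hypothetical:

```python
import numpy as np

def relu_ntk_features(X, m, seed=0):
    """Approximate random features for a depth-1 ReLU NTK.

    Illustrative sketch only: plain arc-cosine random features,
    not the paper's NTKSKETCH / POLYSKETCH algorithms.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((d, m))  # shared Gaussian directions
    Z = X @ W
    # Arc-cosine order-1 features: <phi1(x), phi1(x')> approximates the NNGP term.
    phi1 = np.sqrt(2.0 / m) * np.maximum(Z, 0.0)
    # Arc-cosine order-0 features: <phi0(x), phi0(x')> approximates the derivative term.
    phi0 = np.sqrt(2.0 / m) * (Z > 0).astype(X.dtype)
    # The NTK term kappa0(x, x') * <x, x'> has feature map phi0(x) tensor x;
    # flatten the outer product so that <feat(x), feat(x')> approximates NTK(x, x').
    deriv = (phi0[:, :, None] * X[:, None, :]).reshape(n, m * d)
    return np.hstack([phi1, deriv])
```

For unit-norm inputs the diagonal of the resulting Gram matrix concentrates around kappa1(1) + kappa0(1) = 2, which is a quick sanity check on the approximation.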
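
The ridge-parameter search quoted in the Experiment Setup and Dataset Splits rows can be sketched as below. The validation fraction and candidate grid are assumptions, since the paper does not report the subset size or grid:

```python
import numpy as np

def search_ridge(Phi, y_onehot, lambdas, val_frac=0.2, seed=0):
    """Pick the ridge parameter by accuracy on a random validation subset.

    Hypothetical sketch of the tuning loop; val_frac and lambdas
    are assumed values not reported in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = Phi.shape
    idx = rng.permutation(n)
    n_val = int(val_frac * n)
    val, tr = idx[:n_val], idx[n_val:]
    best_lam, best_acc = None, -1.0
    for lam in lambdas:
        # Closed-form ridge regression on the training part.
        A = Phi[tr].T @ Phi[tr] + lam * np.eye(d)
        W = np.linalg.solve(A, Phi[tr].T @ y_onehot[tr])
        pred = (Phi[val] @ W).argmax(axis=1)
        acc = np.mean(pred == y_onehot[val].argmax(axis=1))
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc
```

Training a linear regressor on one-hot labels and classifying by the argmax of the outputs is the standard way kernel/ridge regression is used for classification benchmarks like the ones in the paper.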