Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scaling Neural Tangent Kernels via Sketching and Random Features

Authors: Amir Zandieh, Insu Han, Haim Avron, Neta Shoham, Chaewon Kim, Jinwoo Shin

NeurIPS 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We benchmark our methods on various large-scale regression and classification tasks and show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on CIFAR-10 dataset while achieving 150× speedup." |
| Researcher Affiliation | Academia | Amir Zandieh (Max-Planck-Institut für Informatik), Insu Han (Yale University), Haim Avron (Tel Aviv University), Neta Shoham (Tel Aviv University), Chaewon Kim (KAIST), Jinwoo Shin (KAIST) |
| Pseudocode | Yes | Algorithm 1: NTKSKETCH for fully-connected ReLU networks; Algorithm 2: Random Features for ReLU NTK via POLYSKETCH |
| Open Source Code | Yes | "Codes are available at https://github.com/insuhan/ntk-sketch-rf." |
| Open Datasets | Yes | "We first benchmark our proposed NTK approximation algorithms on MNIST [25] dataset and compare against gradient-based NTK random features [5] (GRADRF) as a baseline method. Next we test our CNTKSKETCH on CIFAR-10 dataset [24]. We also demonstrate the computational efficiency of our NTKSKETCH and NTKRF using 4 large-scale UCI regression datasets [17]." |
| Dataset Splits | No | "We search the ridge parameter with a random subset of training set and choose the one that achieves the best validation accuracy." The paper mentions using a "random subset of training set" for validation but does not specify the size or selection methodology of this subset, which limits reproducibility. |
| Hardware Specification | Yes | "We run experiments on a system with an Intel E5-2630 CPU with 256 GB RAM and a single GeForce RTX 2080 GPU with 12 GB RAM." |
| Software Dependencies | No | The paper states that "Codes are available at https://github.com/insuhan/ntk-sketch-rf" but does not list specific software dependencies with version numbers in the text. |
| Experiment Setup | Yes | "We use the ReLU network with depth L = 1. We search the ridge parameter with a random subset of training set and choose the one that achieves the best validation accuracy. We choose a convolutional network of depth L = 3 and compare CNTKSKETCH and GRADRF for various feature dimensions. For our methods and RFF, we fix the output dimension to m = 8,192 for all datasets." |
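The experiment-setup protocol quoted above (train a linear ridge regressor on approximate NTK features, selecting the ridge parameter by accuracy on a random subset of the training set) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the dimensions, candidate regularizers, and random features are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical stand-ins for NTK random features and labels.
rng = np.random.default_rng(0)
n, m = 200, 32                       # samples, feature dimension (made up)
X = rng.standard_normal((n, m))      # stand-in for NTKSKETCH/NTKRF features
y = np.sign(X @ rng.standard_normal(m))  # binary labels in {-1, +1}

# Hold out a random subset of the training set for validation,
# as described in the paper's setup (subset size is unspecified there).
idx = rng.permutation(n)
tr, va = idx[:150], idx[150:]

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Sweep candidate ridge parameters; keep the best validation accuracy.
best_lam, best_acc = None, -1.0
for lam in [1e-4, 1e-2, 1.0, 100.0]:
    w = ridge_fit(X[tr], y[tr], lam)
    acc = np.mean(np.sign(X[va] @ w) == y[va])
    if acc > best_acc:
        best_lam, best_acc = lam, acc

print(f"best ridge parameter: {best_lam}, validation accuracy: {best_acc:.3f}")
```

In the paper's pipeline, `X` would be the output of NTKSKETCH or NTKRF (dimension m = 8,192), so the closed-form solve operates on an m × m system independent of the training-set size, which is the source of the reported speedup over exact kernel regression.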