Neural Kernels Without Tangents
Authors: Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Jonathan Ragan-Kelley, Ludwig Schmidt, Benjamin Recht
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we show a correlation in test error between neural network architectures and the associated kernels. We construct a simple neural network architecture using only 3×3 convolutions, 2×2 average pooling, ReLU, and optimized with SGD and MSE loss that achieves 96% accuracy on CIFAR-10, and whose corresponding compositional kernel achieves 90% accuracy. (A hedged sketch of this architecture appears after the table.) |
| Researcher Affiliation | Academia | ¹University of California, Berkeley; ²Massachusetts Institute of Technology. |
| Pseudocode | Yes | Algorithm 1 (Compositional Kernel); a hedged sketch of the kernel recursion appears after the table. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing their code for the described methodology, nor does it provide a direct link to a source code repository. |
| Open Datasets | Yes | We then present comparison results between neural networks, NTKs, and compositional kernels on a variety of datasets, including MNIST, CIFAR-10 (Krizhevsky, 2009), CIFAR-10.1 (Recht et al., 2019), CIFAR-100 (Krizhevsky, 2009), and 90 UCI datasets (Fernández-Delgado et al., 2014). |
| Dataset Splits | Yes | Table 3 compares the performance of neural networks with various depths and their corresponding compositional kernels on both the 10,000 test images from CIFAR-10 and the additional 2,000 harder test images from CIFAR-10.1. We compute the optimal hyperparameters for each dataset (for both NTK and Gaussian kernel) by averaging performance over four cross-validation folds. (A four-fold cross-validation sketch appears after the table.) |
| Hardware Specification | Yes | We implemented all the convolutional kernels in the tensor comprehensions framework (Vasilache et al., 2018) and executed them on V100 GPUs using Amazon Web Services (AWS) P3.16xlarge instances. |
| Software Dependencies | No | The paper mentions implementing kernels in the 'tensor comprehensions framework' and refers to a publication (Vasilache et al., 2018), but it does not provide specific version numbers for this framework or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | We train all the Myrtle CNNs on CIFAR-10 using SGD and the mean squared error (MSE) loss with multi-step learning rate decay. The exact hyperparameters are provided in the appendix. (A training-loop sketch with these settings appears after the table.) |
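
The Research Type row quotes the paper's core architecture recipe: 3×3 convolutions, 2×2 average pooling, and ReLU only. Below is a minimal PyTorch sketch of a Myrtle-style network in that spirit; the channel width, depth, and pooling placement are illustrative assumptions, not the paper's exact configuration (which is given in its appendix).

```python
import torch
import torch.nn as nn

class MyrtleCNN(nn.Module):
    """Sketch of a Myrtle-style CNN built from only 3x3 convolutions,
    2x2 average pooling, and ReLU, per the paper's description.
    Channel widths and depth here are illustrative assumptions."""

    def __init__(self, channels: int = 64, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2),   # 16x16 -> 8x8
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2),   # 8x8 -> 4x4
            nn.AvgPool2d(4),   # global average pool: 4x4 -> 1x1
        )
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))
```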
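The paper's Algorithm 1 builds a compositional kernel by alternating the dual kernels of the network's layers. The sketch below shows the fully connected analogue on flattened inputs: a Gram-matrix step followed by the arc-cosine dual kernel of ReLU. The convolutional version in the paper applies the same recursion patchwise over local image patches; this flattened variant and the `depth` parameter are simplifying assumptions for illustration.

```python
import numpy as np

def relu_dual_kernel(K: np.ndarray) -> np.ndarray:
    """Arc-cosine (order-1) dual kernel of ReLU, applied elementwise
    to a normalized Gram matrix with entries in [-1, 1]."""
    K = np.clip(K, -1.0, 1.0)  # guard against floating-point drift
    return (np.sqrt(1.0 - K**2) + (np.pi - np.arccos(K)) * K) / np.pi

def normalize(K: np.ndarray) -> np.ndarray:
    """Rescale a Gram matrix so its diagonal is all ones."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def compositional_kernel(X: np.ndarray, depth: int = 3) -> np.ndarray:
    """Fully connected analogue of the paper's compositional kernel:
    start from the (normalized) linear Gram matrix of flattened inputs
    and repeatedly apply the ReLU dual kernel."""
    K = normalize(X @ X.T)
    for _ in range(depth):
        K = relu_dual_kernel(K)
    return K
```

The resulting kernel matrix can then be fed to any standard kernel method, e.g. kernel regression against one-hot labels as in the paper.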
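The Dataset Splits row states that optimal hyperparameters were selected by averaging performance over four cross-validation folds. A minimal sketch of that protocol, assuming a Gaussian-kernel SVM as a stand-in predictor (the paper instead solves kernel regression), is:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

def select_bandwidth(X: np.ndarray, y: np.ndarray, gammas) -> float:
    """Pick the Gaussian-kernel bandwidth by averaging accuracy over
    four cross-validation folds, mirroring the quoted protocol.
    The SVM predictor is an illustrative assumption."""
    cv = KFold(n_splits=4, shuffle=True, random_state=0)
    scores = {
        g: cross_val_score(SVC(kernel="rbf", gamma=g), X, y, cv=cv).mean()
        for g in gammas
    }
    return max(scores, key=scores.get)  # bandwidth with best mean accuracy
```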
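The Experiment Setup row specifies SGD, MSE loss, and multi-step learning-rate decay. A minimal PyTorch training loop under those settings might look as follows; the epoch count, base learning rate, momentum, and decay milestones are placeholders, since the paper's exact hyperparameters live in its appendix.

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=70, lr=0.1, milestones=(30, 50), device="cuda"):
    """SGD + MSE training with multi-step learning-rate decay, per the
    setup quoted above. All hyperparameter values here are placeholders."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    sched = torch.optim.lr_scheduler.MultiStepLR(
        opt, milestones=list(milestones), gamma=0.1)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            # MSE loss against one-hot class labels
            target = F.one_hot(y, num_classes=10).float()
            loss = F.mse_loss(model(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
        sched.step()  # decay the learning rate at the given milestones
```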