When Do Neural Networks Outperform Kernel Methods?
Authors: Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figure 1 we carry out such an experiment using Fashion MNIST (FMNIST) data (d = 784, n = 60000, 10 classes). We compare two-layer NNs with the RF and NT models. We choose the architectures of NN, NT, RF so as to match the number of parameters: namely we used N = 4096 for NN and NT and N = 321126 for RF. We also fit the corresponding RKHS models (corresponding to N = ∞) using kernel ridge regression (KRR), and two simple polynomial models: f_ℓ(x) = Σ_{k=0}^{ℓ} ⟨B_k, x^{⊗k}⟩, for ℓ ∈ {1, 2}. In the unperturbed dataset, all of these approaches have comparable accuracies (except the linear fit). As noise is added, RF, NT, and RKHS methods deteriorate rapidly. While the accuracy of NN decreases as well, it significantly outperforms the other methods. |
| Researcher Affiliation | Collaboration | Department of Electrical Engineering, Stanford University Department of Statistics, University of California, Berkeley Department of Statistics, Stanford University Google Research, Brain Team |
| Pseudocode | No | The paper describes models and theoretical results but does not provide any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code used to produce our results can be accessed at https://github.com/bGhorbani/linearized_neural_networks. |
| Open Datasets | Yes | In Figure 1 we carry out such an experiment using Fashion MNIST (FMNIST) data (d = 784, n = 60000, 10 classes). |
| Dataset Splits | No | The paper mentions training on FMNIST and CIFAR-10 data but does not explicitly state the train/validation/test splits or ratios. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or cloud computing instances). |
| Software Dependencies | No | The paper mentions using "ReLU activations" but does not specify any software names with version numbers (e.g., Python, TensorFlow, PyTorch versions) that would be needed for replication. |
| Experiment Setup | Yes | We choose the architectures of NN, NT, RF so as to match the number of parameters: namely we used N = 4096 for NN and NT and N = 321126 for RF. We also fit the corresponding RKHS models (corresponding to N = ∞) using kernel ridge regression (KRR), and two simple polynomial models: f_ℓ(x) = Σ_{k=0}^{ℓ} ⟨B_k, x^{⊗k}⟩, for ℓ ∈ {1, 2}. Throughout we use ReLU activations. |
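The experiment-setup cell above mentions two baselines that are easy to reproduce in spirit: the RKHS model fit by kernel ridge regression, and the degree-ℓ polynomial fits f_ℓ(x) = Σ_{k=0}^{ℓ} ⟨B_k, x^{⊗k}⟩. The sketch below is not the authors' code: it uses a small synthetic dataset in place of FMNIST, an RBF kernel as a stand-in for the paper's RF/NT kernels, and placeholder sizes and regularization values.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Small synthetic stand-in for the FMNIST setup (the paper uses d = 784, n = 60000).
d, n = 20, 500
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] * X[:, 1] + 0.5 * X[:, 2])  # a simple nonlinear target

# RKHS model (the N = infinity limit of RF/NT) via kernel ridge regression.
# An RBF kernel is used here purely for illustration.
krr = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1.0 / d).fit(X, y)
krr_acc = np.mean(np.sign(krr.predict(X)) == y)

# Degree-1 and degree-2 polynomial models f_l(x) = sum_{k<=l} <B_k, x^(tensor k)>,
# realized as ridge regression on explicit polynomial features.
poly_accs = []
for degree in (1, 2):
    Phi = PolynomialFeatures(degree=degree).fit_transform(X)
    poly = Ridge(alpha=1e-3).fit(Phi, y)
    poly_accs.append(np.mean(np.sign(poly.predict(Phi)) == y))

print(f"KRR train accuracy: {krr_acc:.2f}")
print(f"degree-1 / degree-2 polynomial train accuracy: "
      f"{poly_accs[0]:.2f} / {poly_accs[1]:.2f}")
```

To mirror the paper's noise experiment, one would re-fit each model after adding label or input noise and compare how quickly the kernel and polynomial accuracies degrade.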