Fourier Sparse Leverage Scores and Approximate Kernel Learning

Authors: Tamás Erdélyi, Cameron Musco, Christopher Musco

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study a 2-D Gaussian process regression problem, representative of typical data-intensive function interpolation tasks, showing that our oblivious sketching method substantially improves on the original random Fourier features method on which it is based [RR07]. We compare our method against the classical RFF method on a kernel ridge regression problem involving precipitation data from Slovakia [NM13], a benchmark GIS data set. See Figure 3 for a description. The regression solution requires computing (K + λI)^{-1} y, where y is a vector of training data. Doing so with a direct method is slow since K is large and dense, so an iterative solver is necessary. However, when cross validation is used to choose a kernel width σ and regularization parameter λ, the optimal choices lead to a poorly conditioned system, which causes slow convergence. Results on preconditioning are shown in Figure 4. (A minimal illustrative sketch of this preconditioned solve follows the table.)
Researcher Affiliation | Academia | Tamás Erdélyi (Texas A&M University, terdelyi@math.tamu.edu); Cameron Musco (University of Massachusetts Amherst, cmusco@cs.umass.edu); Christopher Musco (New York University, cmusco@nyu.edu)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper discusses implementing the method and its simplifications but does not provide a link to open-source code or explicitly state that the code for the described methodology is released.
Open Datasets | Yes | We compare our method against the classical RFF method on a kernel ridge regression problem involving precipitation data from Slovakia [NM13], a benchmark GIS data set.
Dataset Splits | Yes | Our goal is to approximate this precipitation function based on 6400 training samples from randomly selected locations (visualized as black dots)... when cross validation is used to choose a kernel width σ and regularization parameter λ, the optimal choices lead to a poorly conditioned system, which leads to slow convergence.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'sklearn [PVG+11]' and that the method 'can be implemented in a few lines of code', but it does not specify version numbers for any software dependencies.
Experiment Setup | No | The paper mentions using '6400 training samples' and that 'cross validation is used to choose a kernel width σ and regularization parameter λ', but it does not report the exact σ and λ values used or other system-level experimental configurations needed for reproducibility.
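
For concreteness, the following is a minimal sketch, in Python with numpy, scipy, and sklearn, of the preconditioned solve described in the Research Type row: Gaussian kernel ridge regression solved with conjugate gradient, where a classical random Fourier features approximation of K [RR07] supplies a preconditioner via the Woodbury identity. This is not the authors' code (the paper releases none); the synthetic data, the feature count m, and the σ and λ values are illustrative placeholders, not the paper's cross-validated settings.

import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse.linalg import cg, LinearOperator
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
n, d, m = 2000, 2, 200        # training points, input dim, RFF features (placeholders)
sigma, lam = 0.5, 1e-4        # kernel width and regularization (placeholders)

X = rng.random((n, d))        # stand-in for the Slovakia precipitation locations
y = rng.standard_normal(n)    # stand-in for the precipitation training values

# Dense Gaussian kernel: K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
K = rbf_kernel(X, gamma=1.0 / (2 * sigma**2))

# Classical random Fourier features [RR07]: Z Z^T approximates K,
# with frequencies drawn from N(0, sigma^{-2} I).
W = rng.standard_normal((d, m)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, m)
Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Preconditioner: apply (Z Z^T + lam I)^{-1} via the Woodbury identity,
# which only needs a one-time Cholesky factorization of an m x m matrix.
chol = cho_factor(Z.T @ Z + lam * np.eye(m))
def apply_precond(v):
    return (v - Z @ cho_solve(chol, Z.T @ v)) / lam

M = LinearOperator((n, n), matvec=apply_precond)
A = LinearOperator((n, n), matvec=lambda v: K @ v + lam * v)

# Conjugate gradient on (K + lam I) alpha = y; the RFF preconditioner is
# meant to offset the poor conditioning at cross-validated sigma and lam.
alpha, info = cg(A, y, M=M, maxiter=500)
print("CG exit flag:", info)  # 0 indicates convergence

With m well below n, each preconditioner application costs O(nm) after the one-time O(m^3) factorization, which is the usual argument for RFF-style preconditioning of kernel systems like the one the paper benchmarks in Figure 4.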