Fourier Sparse Leverage Scores and Approximate Kernel Learning
Authors: Tamás Erdélyi, Cameron Musco, Christopher Musco
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study a 2-D Gaussian process regression problem, representative of typical data-intensive function interpolation tasks, showing that our oblivious sketching method substantially improves on the original random Fourier features method on which it is based [RR07]. We compare our method against the classical RFF method on a kernel ridge regression problem involving precipitation data from Slovakia [NM13], a benchmark GIS data set. See Figure 3 for a description. The regression solution requires computing (K + λI)^{-1} y, where y is a vector of training data. Doing so with a direct method is slow since K is large and dense, so an iterative solver is necessary. However, when cross validation is used to choose a kernel width σ and regularization parameter λ, the optimal choices lead to a poorly conditioned system, which in turn slows convergence. Results on preconditioning are shown in Figure 4. (A hedged sketch of this RFF-preconditioned iterative solve appears after the table.) |
| Researcher Affiliation | Academia | Tamás Erdélyi (Texas A&M University, terdelyi@math.tamu.edu); Cameron Musco (University of Massachusetts Amherst, cmusco@cs.umass.edu); Christopher Musco (New York University, cmusco@nyu.edu) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper discusses implementing the method and its simplifications but does not provide a link to open-source code or explicitly state that the code for the described methodology is released. |
| Open Datasets | Yes | We compare our method against the classical RFF method on a kernel ridge regression problem involving precipitation data from Slovakia [NM13], a benchmark GIS data set. |
| Dataset Splits | Yes | Our goal is to approximate this precipitation function based on 6400 training samples from randomly selected locations (visualized as black dots)... when cross validation is used to choose a kernel width σ and regularization parameter λ, the optimal choices lead to a poorly conditioned system, which leads to slow convergence. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'sklearn [PVG+11]' and that the method 'can be implemented in a few lines of code', but it does not specify version numbers for any software dependencies. |
| Experiment Setup | No | The paper mentions using '6400 training samples' and that 'cross validation is used to choose a kernel width σ and regularization parameter λ', but it does not provide specific hyperparameter values (e.g., exact σ, λ values used, learning rates, batch sizes) or detailed system-level training configurations to ensure reproducibility. |
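
To make the solve described in the Research Type row concrete, below is a minimal sketch assuming standard numpy/scipy and synthetic stand-in data; the actual Slovakia precipitation data and the cross-validated σ and λ are not reported in the paper, so all sizes and parameter values here are illustrative placeholders. The sketch builds classical random Fourier features [RR07] for a Gaussian kernel, then solves the kernel ridge regression system (K + λI)x = y by conjugate gradients, using the low-rank RFF approximation as a preconditioner via the Woodbury identity.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(0)
n, d, m = 2000, 2, 512     # samples, input dim, RFF count (the paper uses n = 6400)
sigma, lam = 1.0, 1e-4     # kernel width and regularization (placeholders, not the paper's values)

X = rng.uniform(size=(n, d))       # stand-in for 2-D training locations
y = rng.standard_normal(n)         # stand-in for precipitation measurements

# Classical random Fourier features [RR07] for the Gaussian kernel
# k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)): Z Z^T approximates K.
W = rng.standard_normal((d, m)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Exact dense kernel matrix -- the object that makes a direct solve slow at scale.
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / (2.0 * sigma**2))

# Preconditioner: apply (Z Z^T + lam*I)^{-1} via the Woodbury identity,
# (Z Z^T + lam*I)^{-1} v = (v - Z (Z^T Z + lam*I)^{-1} Z^T v) / lam,
# which costs O(n m) per application instead of O(n^2).
inner = np.linalg.inv(Z.T @ Z + lam * np.eye(m))
M = LinearOperator((n, n), matvec=lambda v: (v - Z @ (inner @ (Z.T @ v))) / lam)

# Preconditioned conjugate gradients on the poorly conditioned system (K + lam*I) x = y.
x, info = cg(K + lam * np.eye(n), y, M=M, maxiter=500)
print("converged" if info == 0 else f"cg info = {info}")
```

With a small λ the unpreconditioned system is badly conditioned, so the RFF preconditioner is what lets CG converge in few iterations; the paper's oblivious sketching method would presumably slot in where the feature matrix Z is constructed, replacing the classical RFF sampling distribution.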