Recursive Sampling for the Nyström Method

Authors: Cameron Musco, Christopher Musco

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically we show that it finds more accurate kernel approximations in less time than popular techniques such as classic Nyström approximation and the random Fourier features method. We conclude with an empirical evaluation of our recursive RLS-Nyström method.
Researcher Affiliation | Academia | Cameron Musco, MIT EECS, cnmusco@mit.edu; Christopher Musco, MIT EECS, cpmusco@mit.edu
Pseudocode | Yes | Algorithm 1 RLS-NYSTRÖM SAMPLING. input: x_1, ..., x_n ∈ X, kernel matrix K, ridge parameter λ > 0, failure probability δ ∈ (0, 1/8) ... Algorithm 2 RECURSIVE RLS-NYSTRÖM. input: x_1, ..., x_m ∈ X, kernel function K : X × X → R, ridge λ > 0, failure prob. δ ∈ (0, 1/32). (A NumPy sketch of the recursive sampling idea follows the table.)
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released, nor does it link to a repository. It mentions existing implementations such as WEKA, scikit-learn, and IBM Libskylark, but provides no code for the authors' own method.
Open Datasets | Yes | We evaluate RLS-Nyström on the Year Prediction MSD, Covertype, Cod-RNA, and Adult datasets downloaded from the UCI ML Repository [Lic13] and [UKM06]. (A loading snippet follows the table.)
Dataset Splits | No | The paper mentions "training points" and "cross validation" but does not specify exact training/validation/test splits (e.g., 80/10/10 percentages or sample counts) needed for reproducibility.
Hardware Specification | No | The paper reports runtimes in seconds and makes general speed comparisons, but does not specify the hardware used for the experiments, such as GPU models, CPU models, or cloud computing instances.
Software Dependencies | No | The paper mentions using a Gaussian kernel and notes that Nyström methods are widely implemented (in the WEKA data mining software, scikit-learn, and IBM Libskylark), but it does not name any software packages with version numbers required to reproduce the experiments.
Experiment Setup | Yes | We use a variant of Algorithm 2 where, instead of choosing a regularization parameter λ, the user sets a sample size s and λ is automatically determined such that s = O(d_eff^λ / δ). ... We use a Gaussian kernel for all tests, with the width parameter σ selected via cross validation on regression and classification tasks. (A cross-validation sketch follows the table.)
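
Since no reference implementation is released, the following is a minimal NumPy sketch of the recursive ridge-leverage-score sampling idea behind Algorithm 2, written from the pseudocode quoted above. The function names, the oversampling constant c, the base-case size, and the use of a plain matrix inverse are illustrative assumptions; the paper's algorithm fixes its constants and sample reweighting more carefully.

import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def recursive_rls_nystrom(X, kernel, lam, c=4.0, base_size=64, rng=None):
    """Return indices of landmark points sampled by approximate ridge
    leverage scores, computed recursively (sketch of Algorithm 2)."""
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    if n <= base_size:                       # base case: keep every point
        return np.arange(n)
    # Recurse on a uniform random half of the data.
    half = rng.choice(n, size=n // 2, replace=False)
    sub = half[recursive_rls_nystrom(X[half], kernel, lam, c, base_size, rng)]
    # Approximate ridge leverage scores of all n points from the landmarks
    # returned by the recursive call:
    #   tau_i ~= (1/lam) * (K - K[:,S] (K[S,S] + lam I)^{-1} K[S,:])_{ii}
    K_nS = kernel(X, X[sub])
    K_SS = kernel(X[sub], X[sub])
    inv = np.linalg.inv(K_SS + lam * np.eye(len(sub)))
    diag_K = np.array([kernel(x[None, :], x[None, :])[0, 0] for x in X])
    tau = (diag_K - np.einsum('ij,jk,ik->i', K_nS, inv, K_nS)) / lam
    # Keep each point independently with probability min(1, c * tau_i).
    keep = rng.random(n) < np.minimum(1.0, c * tau)
    return np.flatnonzero(keep)

Given the returned landmark set S, the standard Nyström approximation is K̃ = K[:, S] · pinv(K[S, S]) · K[S, :], so the full n × n kernel matrix never has to be formed.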
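
All four datasets are publicly available. As one concrete example, Covertype can be pulled through scikit-learn's mirror of the UCI repository; this convenience route and the standardization step are assumptions of this sketch, not the paper's stated pipeline.

from sklearn.datasets import fetch_covtype
from sklearn.preprocessing import StandardScaler

cov = fetch_covtype()                           # Covertype, mirrored from UCI
X = StandardScaler().fit_transform(cov.data)    # 581,012 points, 54 features
y = cov.target                                  # forest cover classes 1-7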
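
The paper selects the Gaussian kernel width σ by cross validation on the downstream task but does not give the grid or fold count. A plain grid search over γ = 1/(2σ²) with kernel ridge regression, continuing from the Covertype variables above, is one standard way to reproduce that step; the grid, the 5 folds, the ridge strength alpha, and the subsampling for speed are all illustrative assumptions.

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

sigmas = np.logspace(-1, 2, 8)                   # candidate widths (assumed grid)
search = GridSearchCV(
    KernelRidge(kernel="rbf", alpha=1e-3),       # gamma = 1 / (2 * sigma^2)
    param_grid={"gamma": 1.0 / (2.0 * sigmas**2)},
    cv=5,
)
# Subsample for speed (exact kernel ridge regression is O(n^3)); the class
# labels are treated as regression targets purely to illustrate the loop.
search.fit(X[:5000], y[:5000].astype(float))
best_gamma = search.best_params_["gamma"]
print("selected sigma:", np.sqrt(1.0 / (2.0 * best_gamma)))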