Recursive Sampling for the Nyström Method
Authors: Cameron Musco, Christopher Musco
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically we show that it finds more accurate kernel approximations in less time than popular techniques such as classic Nyström approximation and the random Fourier features method. We conclude with an empirical evaluation of our recursive RLS-Nyström method. |
| Researcher Affiliation | Academia | Cameron Musco MIT EECS cnmusco@mit.edu Christopher Musco MIT EECS cpmusco@mit.edu |
| Pseudocode | Yes | Algorithm 1 RLS-NYSTRÖM SAMPLING input: x1, . . . , xn ∈ X, kernel matrix K, ridge parameter λ > 0, failure probability δ ∈ (0, 1/8) ... Algorithm 2 RECURSIVE RLS-NYSTRÖM input: x1, . . . , xm ∈ X, kernel function K : X × X → R, ridge λ > 0, failure prob. δ ∈ (0, 1/32) |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is being released or provide a link to a repository. It mentions using existing implementations like WEKA, scikit-learn, and IBM Libskylark, but not code for their own method. |
| Open Datasets | Yes | We evaluate RLS-Nyström on the Year Prediction MSD, Covertype, Cod-RNA, and Adult datasets downloaded from the UCI ML Repository [Lic13] and [UKM06]. |
| Dataset Splits | No | The paper mentions "training points" and "cross validation" but does not specify exact training, validation, or test splits (e.g., 80/10/10 split percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper mentions runtime in seconds and general comparisons but does not specify any particular hardware used for the experiments, such as GPU models, CPU models, or cloud computing instances. |
| Software Dependencies | No | The paper mentions using a Gaussian kernel and that "WEKA data mining software," "scikit-learn," and "IBM Libskylark" are widely implemented, but it does not specify any software names with version numbers required to reproduce the experiments. |
| Experiment Setup | Yes | We use a variant of Algorithm 2 where, instead of choosing a regularization parameter λ, the user sets a sample size s and λ is automatically determined such that s = O(d^λ_eff/δ). ... We use a Gaussian kernel for all tests, with the width parameter σ selected via cross validation on regression and classification tasks. |
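To make the quoted setup concrete, the following is a minimal, non-recursive sketch of ridge-leverage-score (RLS) Nyström sampling with a Gaussian kernel. It is illustrative only: it forms the full kernel matrix and computes exact ridge leverage scores in O(n³) time, whereas the paper's recursive algorithm avoids ever materializing K. The oversampling factor and all parameter values below are assumptions for the example, not the paper's settings.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def rls_nystrom(X, sigma, lam, oversample=3.0, seed=0):
    """Non-recursive RLS-Nystrom sketch (exact leverage scores).

    Computes ridge leverage scores l_i = (K (K + lam*I)^{-1})_{ii},
    keeps each point as a landmark with probability min(1, oversample * l_i),
    and returns the Nystrom approximation C W^+ C^T.
    """
    rng = np.random.default_rng(seed)
    K = gaussian_kernel(X, X, sigma)
    n = K.shape[0]
    scores = np.diag(K @ np.linalg.inv(K + lam * np.eye(n)))
    probs = np.minimum(1.0, oversample * scores)   # sampling probabilities
    S = np.flatnonzero(rng.random(n) < probs)      # landmark indices
    C = K[:, S]                                    # n x |S| cross-kernel block
    W = K[np.ix_(S, S)]                            # |S| x |S| landmark block
    return C @ np.linalg.pinv(W) @ C.T, S

# Usage: approximate a small Gaussian kernel matrix.
X = np.random.default_rng(1).standard_normal((200, 5))
K_tilde, S = rls_nystrom(X, sigma=2.0, lam=1.0)
```

In the paper's variant the user fixes the sample size s and λ is chosen automatically; here λ is passed directly, which is the simpler presentation of Algorithm 1.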