Computationally Efficient Nyström Approximation using Fast Transforms
Authors: Si Si, Cho-Jui Hsieh, Inderjit Dhillon
ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we show the experimental results. Figure 3: low-rank kernel approximation results (x-axis: time; y-axis: relative kernel approximation error). Table 3: data set statistics. Table 4: comparison of kernel SVM prediction on four real-world datasets. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Texas at Austin; Departments of Statistics and Computer Science, University of California at Davis |
| Pseudocode | Yes | Algorithm 1: Fast Transforms for Nyström Approximation (an illustrative Nyström sketch follows the table) |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code availability for the methodology described. |
| Open Datasets | Yes | MNIST dataset with 60,000 samples; webspam data (more than 300,000 data points); Table 3, data set statistics (n: number of samples; d: dimension of samples): USPS (n=9,298, d=256), Covtype (n=581,012, d=54), a9a (n=48,842, d=123), MNIST (n=60,000, d=784), Letter (n=18,000, d=16), CIFAR (n=60,000, d=400), Epsilon (n=25,000, d=2,000), webspam (n=350,000, d=254) |
| Dataset Splits | No | The paper mentions 'train', 'validation', and 'test' in Table 1 but does not specify explicit dataset splits (e.g., percentages or sample counts) for training, validation, and testing sets used in the experiments. It refers to 'validation' in the context of the pseudo-inverse calculation, not as a dataset split for model evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The degree p is set to be 3 in the experiment. In practice, the algorithm usually converges to a reasonably good solution in 10 iterations, so we fix the number of iterations to be 10 for all the experiments. To further improve the speed, in the experiments, we randomly sample 2000 data points to learn the seeds. |
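
For context on the method being evaluated, the sketch below shows a plain Nyström low-rank kernel approximation with uniformly sampled landmarks and computes the relative kernel approximation error used as the metric in Figure 3. This is a minimal baseline sketch, not the paper's fast-transform landmark construction or its seed-learning procedure; the function names, the RBF kernel choice, and parameters such as `gamma` and `m` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Gaussian (RBF) kernel from pairwise squared Euclidean distances.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom(X, m=200, gamma=0.1, seed=0):
    # Standard Nystrom approximation K ~= C @ W_pinv @ C.T with m
    # uniformly sampled landmark points (illustrative baseline only;
    # the paper replaces this sampling with fast-transform landmarks).
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    landmarks = X[idx]
    C = rbf_kernel(X, landmarks, gamma)           # n x m cross-kernel block
    W = rbf_kernel(landmarks, landmarks, gamma)   # m x m landmark block
    return C, np.linalg.pinv(W)                   # pseudo-inverse of W

# Relative kernel approximation error, the metric plotted in Figure 3.
X = np.random.randn(2000, 50)
C, W_pinv = nystrom(X, m=200, gamma=0.05)
K = rbf_kernel(X, X, gamma=0.05)
K_hat = C @ W_pinv @ C.T
rel_err = np.linalg.norm(K - K_hat, "fro") / np.linalg.norm(K, "fro")
print(f"relative kernel approximation error: {rel_err:.4f}")
```

According to the setup quoted above, the paper instead learns seed points from a 2,000-point subsample (degree p = 3, 10 iterations) and generates the landmark block through fast transforms, replacing the uniform sampling step shown in this baseline.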