Sublinear Time Low-Rank Approximation of Distance Matrices

Authors: Ainesh Bakshi, David Woodruff

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we empirically compare our algorithm with the SVD and input sparsity time algorithms. Our algorithm is several hundred times faster than the SVD, and about 8-20 times faster than input sparsity methods on real-world and and synthetic datasets of size 108. Accuracy-wise, our algorithm is only slightly worse than that of the SVD (optimal) and input-sparsity time algorithms.
Researcher Affiliation Academia Ainesh Bakshi Department of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 abakshi@cs.cmu.edu David P. Woodruff Department of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 dwoodruf@cs.cmu.edu
Pseudocode Yes Algorithm 1 : Row Norm Estimation. Algorithm 2 : First Sublinear Time Algorithm.
Open Source Code No The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets Yes The second dataset we use is the popular MNIST dataset which is a collection of 70000 handwritten characters but sample 10000 points from it.
Dataset Splits No The paper describes generating synthetic data and sampling points from MNIST, but does not provide specific details about train/validation/test splits.
Hardware Specification Yes The experiments are run on a Macbook Pro 2017 with 16GB RAM, 2.8GHz quad-core 7th-generation Intel Core i7 processor.
Software Dependencies No The paper mentions software packages like "numpy's linear algebra package", "scipy's sparse linear algebra package", and "scikit-learn" but does not provide specific version numbers for these dependencies.
Experiment Setup Yes We generate 10000 points with 200 features split up into 20 clusters. ... sample 10000 points from it. ... Running Time (in seconds) on the Clustering Dataset for Rank = 20 ... Running Time (in seconds) on the MNIST Dataset for Rank = 40