Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra

Authors: Nadiia Chepurko, Kenneth Clarkson, Lior Horesh, Honghao Lin, David Woodruff

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments demonstrate that the proposed data structures also work well on real-world datasets.
Researcher Affiliation Collaboration 1Department of Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA 2IBM Research, USA 3Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
Pseudocode Yes Algorithm 1 LENSQSAMPLE(DS, S = null, m S, m R), Algorithm 2 RIDGEREGDYN(DS, B, ˆσk, ˆσ1, ε, λ), Algorithm 3 BUILDLOWRANKFACTORS (DYNSAMPLER, k, ˆσk, ˆσk, ε, τ), Algorithm 4 MATVECSAMPLER(A, SAMPLER, W, v, ν), Algorithm 5 LEVSAMPLE(A, µs, f())
Open Source Code No The paper does not provide any links to its own open-source code or explicitly state that its code is available.
Open Datasets Yes KOS data.2 A word frequency dataset. The matrix represents word frequencies in blogs and has dimensions 3430 6906 with 353160 non-zero entries. Movie Lens 100K. (Harper & Konstan, 2016) A movie ratings dataset, which consists of a preference matrix with 100,000 ratings from 611 users across 9,724 movies. Year Prediction.3 A dataset that collects 515345 songs and each song has 90 attributes. The task here is to predict the release year of the song. A R515345 90, B R515345 1. PEMS data. 4 The data describes the occupancy rate of different car lanes of San Francisco bay area freeways. Each row is the time series for a single day.
Dataset Splits No The paper mentions setting parameters like 'sampled rows and columns to be (r, c)' but does not provide specific training, validation, or test dataset splits (e.g., percentages or counts).
Hardware Specification Yes All of our experiments were done in Python and conducted on a laptop with a 1.90GHz CPU and 16GB RAM.
Software Dependencies No The paper states that "All of our experiments were done in Python" but does not specify the version of Python or any other software dependencies with version numbers.
Experiment Setup Yes For the KOS dataset, we set the number of sampled rows and columns to be (r, c) = (500, 700) for both algorithms. For the Movie Lens dataset we set (r, c) = (300, 500). We define the error ε = A Y F / A Ak F 1, where Y is the algorithm s output and Ak is the best k-rank approximation. Since the regime of interest is k n, we vary k among {10, 15, 20}. ... We set A R7000 9000, B R7000 1. ... We set the number of sampled rows and columns to be r and c. For the Year Prediction data, the number of columns is small, and hence we only do row sampling, and likewise, for the PEMS data, we only do column sampling.