Few-Shot Data-Driven Algorithms for Low Rank Approximation

Authors: Piotr Indyk, Tal Wagner, David Woodruff

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that our algorithms, either by themselves or in combination with previous methods, achieve significant empirical advantages over previous work, improving training times by up to an order of magnitude while achieving the same target accuracy.
Researcher Affiliation | Collaboration | Piotr Indyk (MIT, indyk@mit.edu); Tal Wagner (Microsoft Research Redmond, tal.wagner@gmail.com); David P. Woodruff (Carnegie Mellon University, dwoodruf@cs.cmu.edu)
Pseudocode | Yes | Algorithm 1: Sketch-based matrix low-rank approximation; Algorithm 2: Closed-form computation of a sketching matrix from a training matrix (a hedged code sketch of both follows the table)
Open Source Code | No | The paper does not provide any links to open-source code or state that the code is publicly available.
Open Datasets | Yes | For a direct comparison, we use three datasets from [33], with the same train/test partitions: Logo, Eagle, and Hyper [32]. All are publicly available.
Dataset Splits | No | The paper mentions "train/test partitions" but does not specify a separate validation split or how it was handled.
Hardware Specification | Yes | Experiments were run on an NVIDIA GeForce RTX 2080 Ti graphics card.
Software Dependencies | No | The paper mentions "PyTorch" in the context of IVY but does not provide specific version numbers for any software dependencies used in their own experiments.
Experiment Setup | Yes | To test our few-shot algorithms, we evaluate 1Shot2Vec with one randomly chosen training matrix, and FewShotSGD with either 2 or 3 randomly chosen training matrices (that is, 2 or 3 steps of SGD); an illustrative training-loop sketch follows below. In addition, we incorporate our approach into a many-shot algorithm, by initializing the sketching matrix of IVY using our method 1Shot1Vec instead of a random sign matrix. ... The results are reported in Figure 1 and in Table 1. We do not include Greedy in these results, since its training time is much longer than that of the other methods (35-40 minutes per matrix, versus up to 4 minutes for the other methods).
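
For intuition about the pseudocode entries above, here is a minimal NumPy sketch of the generic sketch-and-solve pipeline that Algorithm 1 instantiates, together with one plausible closed-form construction in the spirit of Algorithm 2. The function names, the SVD-based basis computation, and the top-singular-vector construction are our own assumptions; the paper's pseudocode should be consulted for the exact procedures.

```python
import numpy as np

def sketch_lowrank_approx(A, S, k):
    """Sketch-and-solve rank-k approximation (generic form of Algorithm 1).

    A: (n, d) input matrix; S: (m, n) sketching matrix with m >= k.
    Returns a rank-<=k matrix whose rows lie in rowspace(S @ A).
    """
    _, _, Vt = np.linalg.svd(S @ A, full_matrices=False)  # orthonormal basis of rowspace(SA)
    B = A @ Vt.T                                          # project A onto that subspace
    U, s, Wt = np.linalg.svd(B, full_matrices=False)      # best rank-k approx of the projection
    return (U[:, :k] * s[:k]) @ Wt[:k] @ Vt               # lift back to R^(n x d)

def closed_form_sketch(A_train, m):
    """One plausible closed form from a single training matrix (in the
    spirit of Algorithm 2; the paper's exact construction may differ):
    the rows of S are the top-m left singular vectors of A_train, so that
    S @ A_train spans A_train's top-m right singular subspace.
    """
    U, _, _ = np.linalg.svd(A_train, full_matrices=False)
    return U[:, :m].T                                     # (m, n) sketching matrix
```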
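As a rough illustration of the experiment setup, the following PyTorch snippet sketches what "a few steps of SGD, one per training matrix" could look like for FewShotSGD. The loss, the random initialization, the learning rate, and the differentiable rank-k evaluation are illustrative assumptions, not the paper's actual training code.

```python
import torch

def few_shot_sgd(train_mats, m, k, lr=0.1):
    """Illustrative FewShotSGD-style loop: one SGD step on the sketching
    matrix S per training matrix (2-3 matrices => 2-3 steps).
    All hyperparameters here are assumptions for the sketch.
    """
    n = train_mats[0].shape[0]
    S = torch.randn(m, n, requires_grad=True)     # random init; the paper's init may differ
    opt = torch.optim.SGD([S], lr=lr)
    for A in train_mats:
        _, _, Vt = torch.linalg.svd(S @ A, full_matrices=False)  # basis of rowspace(SA)
        B = A @ Vt.T                                             # project A onto it
        U, s, Wt = torch.linalg.svd(B, full_matrices=False)
        Ak = (U[:, :k] * s[:k]) @ Wt[:k] @ Vt     # rank-k approximation of A
        loss = torch.linalg.norm(A - Ak)          # Frobenius-norm error
        opt.zero_grad()
        loss.backward()                           # gradients flow through the SVDs
        opt.step()
    return S.detach()
```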