Optimal Analysis of Subset-Selection Based L_p Low-Rank Approximation
Authors: Chen Dan, Hong Wang, Hongyang Zhang, Yuchen Zhou, Pradeep K. Ravikumar
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the low rank approximation problem of any given matrix A over Rn m and Cn m in entry-wise ℓp loss, that is, finding a rank-k matrix X such that A X p is minimized. Unlike the traditional ℓ2 setting, this particular variant is NP-Hard. We show that the algorithm of column subset selection, which was an algorithmic foundation of many existing algorithms, enjoys approximation ratio (k + 1)1/p for 1 p 2 and (k + 1)1 1/p for p 2. This improves upon the previous O(k + 1) bound for p 1 [1]. We complement our analysis with lower bounds; these bounds match our upper bounds up to constant 1 when p 2. At the core of our techniques is an application of Riesz-Thorin interpolation theorem from harmonic analysis, which might be of independent interest to other algorithmic designs and analysis more broadly. |
| Researcher Affiliation | Academia | Chen Dan Carnegie Mellon University cdan@cs.cmu.edu Hong Wang Princeton University Hong.Wang1991@gmail.com Hongyang Zhang Toyota Technological Institute at Chicago honyanz@ttic.edu Yuchen Zhou University of Wisconsin, Madison yuchenzhou@stat.wisc.edu Pradeep Ravikumar Carnegie Mellon University pradeepr@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1 A cp,k approximation to problem (1) by column subset selection. Algorithm 2 [1] SELECTCOLUMNS (k, A): Selecting O(k log m) columns of A. Algorithm 3 [1]An algorithm that transforms an O(k log m)-rank matrix factorization into a k-rank matrix factorization without inflating the error too much. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with specific datasets. Therefore, no information about publicly available training datasets or access is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with dataset splits. Therefore, no information about validation splits is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments. Therefore, no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experiments requiring specific software dependencies with version numbers for reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not describe any experiments or their setup. Therefore, no specific experimental setup details like hyperparameters or training configurations are provided. |