Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA
Authors: Ilias Diakonikolas, Daniel Kane, Ankit Pensia, Thanasis Pittas
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main contribution is to develop a nearly linear time algorithm for robust PCA with near-optimal error guarantees. We also develop a single-pass streaming algorithm for robust PCA with memory usage nearly-linear in the dimension. ... Theorem 1.2. Let ϵ (0, c) for a small constant c > 0. Let D be a subgaussian distribution with mean zero and (unknown) covariance matrix Σ. ... There exists an algorithm that takes as inputs S and ϵ, runs in e O(nd/ϵ2) time and with probability at least 0.99 outputs a unit vector u such that u Σu (1 O(ϵ log(1/ϵ))) Σ op. |
| Researcher Affiliation | Academia | 1University of Wisconsin-Madison 2University of California, San Diego. Correspondence to: Ankit Pensia <ankitp@cs.wisc.edu>, Thanasis Pittas <pittas@wisc.edu>. |
| Pseudocode | Yes | Algorithm 1 HARDTHRESHOLDINGFILTER; Algorithm 2 SAMPLETOPEIGENVECTOR; Algorithm 3 Robust PCA in Small Number of Iterations; Algorithm 4 HARDTHRESHOLDINGFILTER; Algorithm 5 ROBUSTPCA with improved runtime; Algorithm 6 Estimator of Bpk,t from minibatches; Algorithm 7 ROBUSTPCA in streaming model; Algorithm 8 SAMPLETOPEIGENVECTOR 2. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical analysis of algorithms for robust PCA, using abstract statistical models and distributions (e.g., 'subgaussian distribution', 'ϵ-corrupted set of samples', 'P = (1 ϵ)G + ϵB') rather than specific, publicly available datasets for empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe empirical experiments or the specific hardware used. |
| Software Dependencies | No | The paper is theoretical and does not describe empirical experiments or specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup or hyperparameter details for empirical evaluation. |