reproducibilityindex.ai

Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

Authors: Ilias Diakonikolas, Daniel Kane, Ankit Pensia, Thanasis Pittas

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Our main contribution is to develop a nearly linear time algorithm for robust PCA with near-optimal error guarantees. We also develop a single-pass streaming algorithm for robust PCA with memory usage nearly-linear in the dimension. ... Theorem 1.2. Let ϵ (0, c) for a small constant c > 0. Let D be a subgaussian distribution with mean zero and (unknown) covariance matrix Σ. ... There exists an algorithm that takes as inputs S and ϵ, runs in e O(nd/ϵ2) time and with probability at least 0.99 outputs a unit vector u such that u Σu (1 O(ϵ log(1/ϵ))) Σ op.
Researcher Affiliation	Academia	1University of Wisconsin-Madison 2University of California, San Diego. Correspondence to: Ankit Pensia <ankitp@cs.wisc.edu>, Thanasis Pittas <pittas@wisc.edu>.
Pseudocode	Yes	Algorithm 1 HARDTHRESHOLDINGFILTER; Algorithm 2 SAMPLETOPEIGENVECTOR; Algorithm 3 Robust PCA in Small Number of Iterations; Algorithm 4 HARDTHRESHOLDINGFILTER; Algorithm 5 ROBUSTPCA with improved runtime; Algorithm 6 Estimator of Bpk,t from minibatches; Algorithm 7 ROBUSTPCA in streaming model; Algorithm 8 SAMPLETOPEIGENVECTOR 2.
Open Source Code	No	The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology.
Open Datasets	No	The paper focuses on theoretical analysis of algorithms for robust PCA, using abstract statistical models and distributions (e.g., 'subgaussian distribution', 'ϵ-corrupted set of samples', 'P = (1 ϵ)G + ϵB') rather than specific, publicly available datasets for empirical evaluation.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments with dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe empirical experiments or the specific hardware used.
Software Dependencies	No	The paper is theoretical and does not describe empirical experiments or specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup or hyperparameter details for empirical evaluation.