Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing
Authors: Arun Jambulapati, Jerry Li, Kevin Tian
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We develop two methods for the following fundamental statistical task: given an ϵ-corrupted set of n samples from a d-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix. Our first robust PCA algorithm runs in polynomial time, returns a (1 − O(ϵ log ϵ⁻¹))-approximate top eigenvector, and is based on a simple iterative filtering approach. Our second, which attains a slightly worse approximation factor, runs in nearly-linear time and sample complexity under a mild spectral gap assumption. These are the first polynomial-time algorithms yielding non-trivial information about the covariance of a corrupted sub-Gaussian distribution without requiring additional algebraic structure of moments. |
| Researcher Affiliation | Collaboration | Arun Jambulapati Stanford University jmblpati@stanford.edu Jerry Li Microsoft Research jerrl@microsoft.com Kevin Tian Stanford University kjtian@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Packing LP(A, ϵ) ... Algorithm 2 PNorm Packing(A, ϵ, p) ... Algorithm 3 Schatten Packing({Aᵢ}ᵢ∈[n], ϵ, p) ... Algorithm 4 Robust PCA({Xᵢ}ᵢ∈[n], ϵ, t) ... Algorithm 6 |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that open-source code for the described methodology is provided. |
| Open Datasets | No | The paper discusses theoretical properties of algorithms for sub-Gaussian distributions and does not specify the use of any named, publicly available datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe training, validation, or test dataset splits for empirical reproduction. |
| Hardware Specification | No | The paper focuses on theoretical algorithms and their performance guarantees, and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper does not provide details about experimental setup, such as hyperparameter values or training configurations, as it is a theoretical work. |
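The statistical task in the abstract can be illustrated with a small simulation. The sketch below (not the paper's algorithm; all dimensions, the planted spike, and the corruption pattern are arbitrary choices for illustration) draws n samples from a d-dimensional Gaussian with a spiked covariance, replaces an ϵ-fraction with adversarial outliers, and compares the quadratic-form approximation quality vᵀΣv / λ_max(Σ) of the naive empirical top eigenvector before and after corruption:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 2000, 10, 0.1

# True covariance: identity plus a planted spike along e_1,
# so the true top eigenvector is e_1 with eigenvalue 5.
v_true = np.zeros(d)
v_true[0] = 1.0
cov = np.eye(d) + 4.0 * np.outer(v_true, v_true)
X = rng.multivariate_normal(np.zeros(d), cov, size=n)

# Adversarial corruption: replace an eps-fraction of samples with
# large outliers along e_2, pulling the top eigenvector off e_1.
k = int(eps * n)
X_corrupt = X.copy()
X_corrupt[:k] = 20.0 * np.eye(d)[1]

def top_eigvec(samples):
    """Top eigenvector of the (uncentered) empirical covariance."""
    sigma_hat = samples.T @ samples / len(samples)
    _, eigvecs = np.linalg.eigh(sigma_hat)
    return eigvecs[:, -1]

v_clean = top_eigvec(X)
v_bad = top_eigvec(X_corrupt)

# Approximation quality v^T Sigma v / lambda_max(Sigma): near 1 for the
# clean estimate, degraded for the corrupted one.
lam_max = 5.0
quality = lambda v: float(v @ cov @ v) / lam_max
print(round(quality(v_clean), 3), round(quality(v_bad), 3))
```

The paper's contribution is precisely a polynomial-time (resp. nearly-linear-time) procedure whose output on the corrupted samples still achieves quality 1 − O(ϵ log ϵ⁻¹), which this naive estimator does not.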