reproducibilityindex.ai

Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Authors: Arun Jambulapati, Jerry Li, Kevin Tian

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We develop two methods for the following fundamental statistical task: given an ϵ-corrupted set of n samples from a d-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix. Our ﬁrst robust PCA algorithm runs in polynomial time, returns a 1 O(ϵ log ϵ 1)-approximate top eigenvector, and is based on a simple iterative ﬁltering approach. Our second, which attains a slightly worse approximation factor, runs in nearly-linear time and sample complexity under a mild spectral gap assumption. These are the ﬁrst polynomial-time algorithms yielding non-trivial information about the covariance of a corrupted sub-Gaussian distribution without requiring additional algebraic structure of moments.
Researcher Affiliation	Collaboration	Arun Jambulapati Stanford University jmblpati@stanford.edu Jerry Li Microsoft Research jerrl@microsoft.com Kevin Tian Stanford University kjtian@stanford.edu
Pseudocode	Yes	Algorithm 1 Packing LP(A, ϵ) ... Algorithm 2 PNorm Packing(A, ϵ, p) ... Algorithm 3 Schatten Packing({Ai}i [n], ϵ, p) ... Algorithm 4 Robust PCA({Xi}i [n], ϵ, t) ... Algorithm 6
Open Source Code	No	The paper does not contain any explicit statements or links indicating that open-source code for the described methodology is provided.
Open Datasets	No	The paper discusses theoretical properties of algorithms for sub-Gaussian distributions and does not specify the use of any named, publicly available datasets for training.
Dataset Splits	No	The paper is theoretical and does not describe training, validation, or test dataset splits for empirical reproduction.
Hardware Specification	No	The paper focuses on theoretical algorithms and their performance guarantees, and does not mention any specific hardware used for experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers.
Experiment Setup	No	The paper does not provide details about experimental setup, such as hyperparameter values or training configurations, as it is a theoretical work.