Sum-of-Squares Lower Bounds for Sparse PCA
Authors: Tengyu Ma, Avi Wigderson
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper establishes a statistical versus computational trade-off for solving a basic high-dimensional machine learning problem via a basic convex relaxation method. Specifically, we consider the Sparse Principal Component Analysis (Sparse PCA) problem, and the family of Sum-of-Squares (SoS, aka Lasserre/Parrilo) convex relaxations. It was well known that in large dimension p, a planted k-sparse unit vector can in principle be detected using only n ≈ k log p (Gaussian or Bernoulli) samples, but all known efficient (polynomial-time) algorithms require n ≈ k² samples. It was also known that this quadratic gap cannot be improved by the most basic semi-definite (SDP, aka spectral) relaxation, equivalent to a degree-2 SoS algorithm. Here we prove that degree-4 SoS algorithms also cannot improve this quadratic gap. This average-case lower bound adds to the small collection of hardness results in machine learning for this powerful family of convex relaxation algorithms. |
| Researcher Affiliation | Academia | Tengyu Ma 1 and Avi Wigderson 2 1Department of Computer Science, Princeton University 2School of Mathematics, Institute for Advanced Study |
| Pseudocode | Yes | Algorithm 1 SoS₄(Σ̂): Degree-4 Sum-of-Squares Relaxation. Solve the following SDP and obtain the optimal objective value SoS₄(Σ̂) and maximizer M. Variables: M(S), for all multi-sets S of size at most 4. SoS₄(Σ̂) = max Σ_{i,j} M(x_i x_j) Σ̂_{ij} (Obj), subject to: Σ_{i∈[p]} M(x_i²) = k and Σ_{i,j∈[p]} \|M(x_i x_j)\| ≤ k² (C1&2); M(x_i³ x_j) = M(x_i x_j) and Σ_{ℓ∈[p]} M(x_ℓ² x_i x_j) = k·M(x_i x_j), for all i, j ∈ [p] (C4); Σ_{i,j,s,t∈[p]} \|M(x_i x_j x_s x_t)\| ≤ k⁴ and M ⪰ 0 (C5&6). Output: 1. For the detection problem: output Hv if SoS₄(Σ̂) > (1 + ½λ)k, H0 otherwise. 2. For the estimation problem: output M₂ = (M(x_i x_j))_{i,j∈[p]}. |
| Open Source Code | No | No statement regarding the release of source code was found. The paper only mentions 'A complete paper is available as supplementary material or on arxiv.' |
| Open Datasets | No | The paper describes a theoretical model for data generation rather than using or referring to a publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental data splits. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for computations. |
| Software Dependencies | No | The paper describes mathematical frameworks and algorithms but does not mention specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations. |
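The detection problem that the quoted SoS₄ rule addresses distinguishes the null hypothesis H0 (isotropic Gaussian samples) from Hv (a planted k-sparse spike). A minimal runnable sketch of that planted spiked-covariance model is below; the parameter values (p, k, n, lam) and variable names are illustrative assumptions, not taken from the paper, which reports no experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) problem sizes: ambient dimension p, sparsity k,
# sample count n, and spike strength lam.
p, k, n, lam = 200, 10, 1000, 0.5

# Under Hv: a planted k-sparse unit vector v with entries +-1/sqrt(k)
# on a random support.
v = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)

# Samples x ~ N(0, I + lam * v v^T), then the empirical covariance,
# which plays the role of Sigma-hat in the SoS4 relaxation above.
cov = np.eye(p) + lam * np.outer(v, v)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
Sigma_hat = X.T @ X / n

# Along the planted direction the quadratic form concentrates near
# 1 + lam, versus roughly 1 along a generic direction under H0 --
# the gap that detection rules (degree-2 spectral or degree-4 SoS)
# try to exploit from few samples.
signal = v @ Sigma_hat @ v
print(signal)
```

This only illustrates the data-generating model; solving the degree-4 relaxation itself would require an SDP solver over the moment matrix M, which is beyond the scope of this sketch.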