Sum-of-Squares Lower Bounds for Sparse PCA

Authors: Tengyu Ma, Avi Wigderson

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper establishes a statistical versus computational trade-off for solving a basic high-dimensional machine learning problem via a basic convex relaxation method. Specifically, we consider the Sparse Principal Component Analysis (Sparse PCA) problem and the family of Sum-of-Squares (SoS, aka Lasserre/Parrilo) convex relaxations. It was well known that in large dimension p, a planted k-sparse unit vector can in principle be detected using only n ≈ k log p (Gaussian or Bernoulli) samples, but all known efficient (polynomial-time) algorithms require n ≈ k² samples. It was also known that this quadratic gap cannot be improved by the most basic semi-definite (SDP, aka spectral) relaxation, equivalent to degree-2 SoS algorithms. Here we prove that degree-4 SoS algorithms also cannot improve this quadratic gap. This average-case lower bound adds to the small collection of hardness results in machine learning for this powerful family of convex relaxation algorithms. (A simulation sketch of this planted model appears after the table.)
Researcher Affiliation | Academia | Tengyu Ma (Department of Computer Science, Princeton University) and Avi Wigderson (School of Mathematics, Institute for Advanced Study)
Pseudocode | Yes | Algorithm 1 $\mathrm{SoS}_4(\hat\Sigma)$: Degree-4 Sum-of-Squares Relaxation. Solve the following SDP and obtain the optimal objective value $\mathrm{SoS}_4(\hat\Sigma)$ and maximizer $M$. Variables: $M(S)$ for all multi-sets $S$ of size at most 4.

$$\mathrm{SoS}_4(\hat\Sigma) = \max \sum_{i,j} M(x_i x_j)\,\hat\Sigma_{ij} \quad \text{(Obj)}$$

subject to

$$\sum_{i\in[p]} M(x_i^2) = k \quad \text{and} \quad \sum_{i,j\in[p]} |M(x_i x_j)| \le k^2 \quad \text{(C1\&2)}$$

$$M(x_i^3 x_j) = M(x_i x_j) \quad \text{and} \quad \sum_{\ell\in[p]} M(x_\ell^2 x_i x_j) = k\,M(x_i x_j), \quad \forall\, i,j \in [p] \quad \text{(C3\&4)}$$

$$\sum_{i,j,s,t\in[p]} |M(x_i x_j x_s x_t)| \le k^4 \quad \text{and} \quad M \succeq 0 \quad \text{(C5\&6)}$$

Output: 1. For the detection problem: output $H_v$ if $\mathrm{SoS}_4(\hat\Sigma) > (1+\tfrac{1}{2}\lambda)k$, and $H_0$ otherwise. 2. For the estimation problem: output $M_2 = (M(x_i x_j))_{i,j\in[p]}$. (A hedged cvxpy encoding of this SDP appears after the table.)
Open Source Code | No | No statement regarding the release of source code was found. The paper only mentions 'A complete paper is available as supplementary material or on arxiv.'
Open Datasets | No | The paper describes a theoretical model for data generation rather than using or referring to a publicly available dataset.
Dataset Splits | No | The paper is theoretical and does not describe experimental data splits.
Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for computations.
Software Dependencies | No | The paper describes mathematical frameworks and algorithms but does not mention specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations.
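
The planted model referenced in the abstract can be made concrete with a short simulation. The sketch below is our own illustration, not from the paper: under H_v, samples are drawn from N(0, I + λvvᵀ) for a k-sparse unit vector v, and under H_0 from N(0, I). The detector shown (diagonal thresholding) is a standard efficient baseline that needs roughly n ≈ k² samples, not the SoS relaxation itself; all parameter values are arbitrary choices.

```python
# A minimal simulation of the planted detection problem, assuming the spiked
# covariance model N(0, I + lam * v v^T); diagonal thresholding is a simple
# efficient detector, shown here only to illustrate the sample-size regimes.
import numpy as np

rng = np.random.default_rng(0)

def samples(n, p, k, lam, planted):
    """Draw n samples from N(0, I) or, if planted, from N(0, I + lam * v v^T)."""
    v = np.zeros(p)
    if planted:
        support = rng.choice(p, size=k, replace=False)
        v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)  # k-sparse unit vector
    cov = np.eye(p) + lam * np.outer(v, v)
    return rng.multivariate_normal(np.zeros(p), cov, size=n)

def max_diag(X):
    """Largest diagonal entry of the empirical covariance. On the k planted
    coordinates the population variance is 1 + lam/k, so this statistic
    separates H_v from H_0 once n is roughly (k/lam)^2 * log p."""
    return np.max(np.diag(X.T @ X / X.shape[0]))

p, k, lam = 200, 10, 2.0
for n in (50, 5000):
    print(f"n={n}: H_v stat {max_diag(samples(n, p, k, lam, True)):.3f}"
          f" vs H_0 stat {max_diag(samples(n, p, k, lam, False)):.3f}")
```

At n = 50 the two statistics overlap, while at n = 5000 the planted coordinate's inflated variance (1 + λ/k = 1.2 here) stands clear of the null fluctuations, illustrating the quadratic-in-k sample requirement of efficient detectors.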
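To make Algorithm 1 concrete, here is a hedged cvxpy encoding of the degree-4 relaxation, usable for toy dimensions only (the moment matrix has 1 + p + p(p+1)/2 rows, and constraint (C5) sums over p⁴ terms). The function name sos4, the normalization M(∅) = 1 (the standard pseudo-expectation condition, which the listed constraints leave implicit), and the choice of solver are our assumptions, not details from the paper.

```python
# A minimal sketch of Algorithm 1 (SoS4), assuming cvxpy with an SDP-capable
# solver such as SCS. Comments label the constraints (Obj), (C1)-(C6) above.
import itertools
import numpy as np
import cvxpy as cp

def sos4(Sigma_hat, k):
    p = Sigma_hat.shape[0]
    # Monomials of degree <= 2 index the moment matrix: 1, x_i, x_i x_j.
    monos = [()] + [(i,) for i in range(p)] \
          + list(itertools.combinations_with_replacement(range(p), 2))
    n = len(monos)
    mom = cp.Variable((n, n), symmetric=True)
    # Tie together all entries whose row/column monomials multiply to the same
    # multi-set, so mom encodes a single value M(S) per multi-set S, |S| <= 4.
    rep, cons = {}, []
    for a in range(n):
        for b in range(a, n):
            key = tuple(sorted(monos[a] + monos[b]))
            if key in rep:
                cons.append(mom[a, b] == mom[rep[key]])
            else:
                rep[key] = (a, b)
    def M(*s):  # M(S) for any multi-set S of size at most 4
        return mom[rep[tuple(sorted(s))]]
    cons += [mom >> 0, M() == 1]  # (C6), plus pseudo-expectation normalization
    cons += [sum(M(i, i) for i in range(p)) == k]                           # (C1)
    cons += [sum(cp.abs(M(i, j))
                 for i in range(p) for j in range(p)) <= k ** 2]            # (C2)
    for i in range(p):
        for j in range(p):
            cons += [M(i, i, i, j) == M(i, j)]                              # (C3)
            cons += [sum(M(l, l, i, j) for l in range(p)) == k * M(i, j)]   # (C4)
    cons += [sum(cp.abs(M(i, j, s, t))
                 for i in range(p) for j in range(p)
                 for s in range(p) for t in range(p)) <= k ** 4]            # (C5)
    obj = cp.Maximize(sum(M(i, j) * Sigma_hat[i, j]
                          for i in range(p) for j in range(p)))             # (Obj)
    prob = cp.Problem(obj, cons)
    prob.solve()
    M2 = np.array([[M(i, j).value for j in range(p)] for i in range(p)])
    return prob.value, M2
```

With this, the algorithm's output step reads: compute val, M2 = sos4(Sigma_hat, k), declare H_v when val > (1 + lam / 2) * k for the detection problem, and return M2 for the estimation problem.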