Sub-exponential time Sum-of-Squares lower bounds for Principal Components Analysis

Authors: Aaron Potechin, Goutham Rajendran

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we study the limits of the powerful Sum-of-Squares (SoS) family of algorithms for Sparse PCA. We show that SoS algorithms cannot beat known spectral algorithms, even if we allow sub-exponential time! Therefore, this suggests that currently used algorithms such as thresholding or other spectral algorithms are in a sense optimal for this problem. Moreover, our techniques are strong enough to obtain similar tradeoffs for Tensor PCA, another important higher-order variant of PCA with applications in topic modeling, video processing, etc. Our main contributions can be summarized as follows: 1. Despite the huge breakthroughs achieved by Sum-of-Squares algorithms in recent works on high-dimensional statistics, we show barriers to them for the fundamental problems of Sparse PCA and Tensor PCA. 3. Prior lower bounds for these problems have either focused on weaker classes of algorithms or were obtained assuming other hardness conjectures, whereas we prove high-degree sub-exponential time SoS lower bounds without relying on any conjectures. 'If you are including theoretical results... Did you include complete proofs of all theoretical results? [Yes] See the supplementary.' 'If you ran experiments... [N/A]' (An illustrative sketch of the spectral baselines mentioned here appears after this table.)
Researcher Affiliation | Academia | Aaron Potechin, University of Chicago (potechin@uchicago.edu); Goutham Rajendran, University of Chicago (goutham@uchicago.edu)
Pseudocode | No | The paper focuses on theoretical lower bounds and does not present pseudocode or algorithm blocks.
Open Source Code | No | 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'
Open Datasets | No | The paper uses theoretical models, namely the 'Wishart random model of Sparse PCA' and 'Tensor PCA', rather than concrete, publicly available datasets; no publicly available training dataset is mentioned. 'If you ran experiments... [N/A]' (Illustrative sketches of these synthetic models appear after this table.)
Dataset Splits | No | The paper is theoretical and does not involve experimental runs with data splits, so no training/test/validation splits are provided. 'If you ran experiments... [N/A]'
Hardware Specification | No | The paper is theoretical and does not describe any experimental hardware. 'If you ran experiments... [N/A]'
Software Dependencies | No | The paper is theoretical and does not list any software dependencies or version numbers. 'If you ran experiments... [N/A]'
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. 'If you ran experiments... [N/A]'
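
The "thresholding or other spectral algorithms" referenced in the Research Type row are the baselines that the paper's lower bounds show SoS cannot beat: degree-d SoS can be solved in roughly n^(O(d)) time, so "sub-exponential time" corresponds to SoS degree n^delta for some constant delta > 0. Purely as an illustration (the paper itself contains no experiments), here is a minimal sketch of a classical diagonal-thresholding baseline for the spiked Wishart model of Sparse PCA, assuming the standard parameterization N(0, I + lam * u u^T) with a k-sparse unit vector u; the function names and parameter values below are our own choices, not the paper's.

```python
import numpy as np

def sample_spiked_wishart(m, n, k, lam, rng):
    """Draw m samples from N(0, I + lam * u u^T), where u is a k-sparse
    unit vector with entries +-1/sqrt(k) on a uniformly random support."""
    u = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    u[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    g = rng.standard_normal((m, n))      # isotropic Gaussian component
    z = rng.standard_normal((m, 1))      # coefficient along the planted spike
    return g + np.sqrt(lam) * z * u, u   # covariance is I + lam * u u^T

def diagonal_thresholding(X, k):
    """Johnstone-Lu style baseline: pick the k coordinates with the largest
    diagonal entries of the sample covariance, then return the top
    eigenvector of the covariance restricted to those coordinates."""
    m, n = X.shape
    cov = X.T @ X / m
    support = np.argsort(np.diag(cov))[-k:]
    block = cov[np.ix_(support, support)]
    _, vecs = np.linalg.eigh(block)      # eigenvalues in ascending order
    u_hat = np.zeros(n)
    u_hat[support] = vecs[:, -1]         # eigenvector of the largest eigenvalue
    return u_hat

rng = np.random.default_rng(0)
X, u = sample_spiked_wishart(m=2000, n=500, k=10, lam=3.0, rng=rng)
u_hat = diagonal_thresholding(X, k=10)
print(f"overlap |<u, u_hat>| = {abs(u @ u_hat):.3f}")  # close to 1 for these parameters
```

The paper's result indicates that, in the hard regime of parameters, even sub-exponential time SoS cannot substantially improve on spectral baselines of this kind.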
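Likewise, the 'Tensor PCA' model named in the Open Datasets row is a synthetic distribution rather than a dataset. Below is a minimal sketch, assuming the common order-3 spiked tensor parameterization T = lam * u ⊗ u ⊗ u + G with Gaussian noise G, of both the model and the tensor-unfolding spectral baseline; the n^(3/4) threshold noted in the comments is the standard one for unfolding and is our annotation, not a quotation from the paper.

```python
import numpy as np

def sample_spiked_tensor(n, lam, rng):
    """Order-3 spiked tensor model: T = lam * u ⊗ u ⊗ u + G, where u is a
    uniformly random unit vector and G has i.i.d. N(0, 1) entries."""
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    signal = lam * np.einsum('i,j,k->ijk', u, u, u)
    return signal + rng.standard_normal((n, n, n)), u

def unfolding_estimate(T):
    """Spectral baseline via tensor unfolding: flatten T into an n x n^2
    matrix and return its top left singular vector. This estimate is
    informative once lam is a large constant times n^(3/4)."""
    n = T.shape[0]
    U, _, _ = np.linalg.svd(T.reshape(n, n * n), full_matrices=False)
    return U[:, 0]

rng = np.random.default_rng(1)
n = 50
T, u = sample_spiked_tensor(n, lam=3.0 * n ** 0.75, rng=rng)
print(f"overlap |<u, unfolding>| = {abs(u @ unfolding_estimate(T)):.3f}")
```

The paper's Tensor PCA lower bounds say that even sub-exponential time SoS cannot go far below the threshold that spectral methods of this kind achieve.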