Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time

Authors: Zhaoran Wang, Huanran Lu, Han Liu

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Numerical Results: Figure 2 illustrates the main theoretical results. For (a)-(c), we set d = 200, s* = 10, k = 5, n = 100, and Σ's eigenvalues are {100, 100, 100, 100, 10, 1, ..., 1}. In detail, (a) illustrates the 1/√t decay of optimization error at the initialization stage; (b) illustrates the decay of the total estimation error (in log scale) at the main stage; (c) illustrates the basin of attraction phenomenon, as well as the geometric decay of optimization error (in log scale) of sparse orthogonal iteration pursuit as characterized in §4. For (d) and (e), the eigenstructure is the same, while d, n and s* take multiple values. They show that the theoretical √(s* log d/n) statistical rate of our estimator is tight in practice. In Table 1, we compare the subspace error of our procedure with existing methods, where all except our procedure and convex relaxation [13] leverage the deflation method [24] for subspace estimation with k > 1. We consider two settings: setting (i) is more challenging than setting (ii), since the top k eigenvalues of Σ are not distinct, the eigengap is small, and the sample size is smaller. Our procedure significantly outperforms other existing methods on subspace recovery in both settings. (A simulation sketch based on this setup appears after the table.)
Researcher Affiliation | Academia | Zhaoran Wang, Huanran Lu, Han Liu; Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08540; {zhaoran,huanranl,hanliu}@princeton.edu
Pseudocode | Yes | Algorithm 1 Main stage: sparse orthogonal iteration pursuit. ... Algorithm 2 Main stage: the Truncate(·, ·) function used in Line 7 of Algorithm 1. ... Algorithm 3 Initialization stage: solving convex relaxation (10) using ADMM. (A hedged sketch of the main-stage iteration appears after the table.)
Open Source Code | No | The paper does not provide any concrete access information for source code, such as a repository link or an explicit statement about code release.
Open Datasets | No | Let x_1, ..., x_n be independent realizations of X ~ M_d(Σ, k, s*) with n ≥ n_min, and Σ̂ be the sample covariance matrix. ... For (a)-(c), we set d = 200, s* = 10, k = 5, n = 100, and Σ's eigenvalues are {100, 100, 100, 100, 10, 1, ..., 1}.
Dataset Splits | No | The paper describes a two-stage procedure (initialization and main stage) and uses simulated data, but does not specify any train/validation/test dataset splits for empirical evaluation in the conventional machine learning sense.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers.
Experiment Setup | Yes | For (a)-(c), we set d = 200, s* = 10, k = 5, n = 100, and Σ's eigenvalues are {100, 100, 100, 100, 10, 1, ..., 1}. ... Suppose the regularization parameter is ρ = Cλ_1 √(log d/n) for a sufficiently large C > 0 in (10), and the penalty parameter β of ADMM (Line 3 of Algorithm 3) is β = dρ/k. Also, suppose the sparsity parameter ŝ in Algorithm 1 (Line 3) is chosen such that ŝ = C · max{4k/(√γ − 1)², 1} · s*, where C ≥ 1 is an integer constant. (These parameter choices are sketched after the table.)
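
The simulated design quoted in the Research Type row can be reproduced in outline. The sketch below is not the authors' code: it builds a covariance matrix with the stated spectrum {100, 100, 100, 100, 10, 1, ..., 1} and an s*-sparse top-k eigenspace, then draws n = 100 samples. The Gaussian design, the particular sparse-eigenvector construction, and the sin-Theta subspace error are assumptions made for illustration.

```python
# Minimal sketch of the simulated design (d = 200, s* = 10, k = 5, n = 100).
# Not the authors' code; the Gaussian design and sin-Theta error are assumptions.
import numpy as np

def spiked_covariance(d=200, s_star=10, k=5, seed=0):
    """Sigma with leading eigenvalues {100, 100, 100, 100, 10} on an s*-sparse subspace."""
    rng = np.random.default_rng(seed)
    lead = np.array([100.0, 100.0, 100.0, 100.0, 10.0])
    U = np.zeros((d, k))
    U[:s_star, :] = np.linalg.qr(rng.standard_normal((s_star, k)))[0]  # sparse, orthonormal
    Sigma = np.eye(d) + U @ np.diag(lead - 1.0) @ U.T  # remaining eigenvalues equal 1
    return Sigma, U

def subspace_error(U_hat, U):
    """Frobenius sin-Theta distance between the spans of U_hat and U."""
    cosines = np.linalg.svd(U_hat.T @ U, compute_uv=False)
    return np.sqrt(np.sum(np.clip(1.0 - cosines**2, 0.0, None)))

rng = np.random.default_rng(0)
Sigma, U_true = spiked_covariance()
X = rng.multivariate_normal(np.zeros(200), Sigma, size=100)  # n = 100 samples
Sigma_hat = X.T @ X / 100                                    # sample covariance
```

Plotting subspace_error against √(s* log d/n) over several (d, n, s*) configurations is the kind of check that panels (d) and (e) of Figure 2 use to assess the tightness of the statistical rate.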
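The Pseudocode row lists the paper's three algorithms. The sketch below gives one possible reading of the main stage (Algorithms 1-2): a block power step on Σ̂ interleaved with a row-wise Truncate step that keeps only the ŝ rows of largest norm. The exact placement of the QR factorizations and the stopping rule are assumptions and should be checked against the paper's pseudocode; the initialization U0 is meant to come from the ADMM stage (Algorithm 3), which is not reproduced here.

```python
# Hedged sketch of the main stage (Algorithms 1-2), not the authors' implementation.
import numpy as np

def truncate(V, s_hat):
    """Keep the s_hat rows of V with largest Euclidean norm; zero out the rest."""
    keep = np.argsort(np.linalg.norm(V, axis=1))[-s_hat:]
    V_trunc = np.zeros_like(V)
    V_trunc[keep, :] = V[keep, :]
    return V_trunc

def sparse_orthogonal_iteration_pursuit(Sigma_hat, U0, s_hat, n_iter=100):
    """Block power step on Sigma_hat, orthonormalize, truncate, re-orthonormalize."""
    U = U0
    for _ in range(n_iter):
        T, _ = np.linalg.qr(Sigma_hat @ U)       # power step + thin QR
        U, _ = np.linalg.qr(truncate(T, s_hat))  # row truncation + thin QR
    return U
```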
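The tuning parameters quoted in the Experiment Setup row translate directly into code. In the sketch below, the constants C and the eigenvalue-ratio quantity γ are placeholders: the paper only requires one C to be sufficiently large and the other to be an integer ≥ 1, and defines γ from the eigenstructure, so the numerical values here are assumptions rather than the authors' choices.

```python
# Hedged sketch of the tuning parameters; the constants and gamma are placeholders.
import numpy as np

d, n, k, s_star = 200, 100, 5, 10
lambda_1 = 100.0       # top eigenvalue of Sigma in the simulated design
gamma = 10.0           # eigenvalue-ratio quantity from the theory (placeholder value)
C_rho, C_s = 1.0, 1    # "sufficiently large" constant and integer constant C >= 1 (placeholders)

rho = C_rho * lambda_1 * np.sqrt(np.log(d) / n)  # regularization parameter of (10)
beta = d * rho / k                               # ADMM penalty parameter (Algorithm 3, Line 3)
s_hat = C_s * max(int(np.ceil(4 * k / (np.sqrt(gamma) - 1) ** 2)), 1) * s_star
```

These are the quantities that would be handed to the ADMM initialization and to the Truncate step of Algorithm 1 in the sketches above.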