Support Recovery in Sparse PCA with Incomplete Data

Authors: Hanbyul Lee, Qifan Song, Jean Honorio

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theoretical results with incomplete synthetic data, and show encouraging and meaningful results on a gene expression dataset. [Section 4, Numerical Results] We perform the SDP algorithm of (1) on synthetic and real data to validate our theoretic results and show how well the true support of the sparse principal component is exactly recovered.
Researcher Affiliation | Academia | Hanbyul Lee, Department of Statistics, Purdue University, West Lafayette, IN 47906, lee3078@purdue.edu; Qifan Song, Department of Statistics, Purdue University, West Lafayette, IN 47906, qfsong@purdue.edu; Jean Honorio, Department of Computer Science, Purdue University, West Lafayette, IN 47906, jhonorio@purdue.edu
Pseudocode | No | The paper describes mathematical formulations and optimization problems, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We provide the code in the supplemental material.
Open Datasets | Yes | We analyze a gene expression dataset (GSE21385) from the Gene Expression Omnibus website (https://www.ncbi.nlm.nih.gov/geo/).
Dataset Splits | No | The paper describes repeating experiments with different random seeds and generating synthetic data, but does not explicitly provide details about training, validation, and test dataset splits for model development or evaluation.
Hardware Specification | No | The paper states 'Our experiments were executed on MATLAB and standard CVX code was used' but does not specify any hardware details like GPU/CPU models or memory. The authors also explicitly stated 'No' to the question 'Did you include the total amount of compute and the type of resources used?' in their own analysis.
Software Dependencies | No | The paper mentions 'MATLAB and standard CVX code was used', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | In this experiment, we fix the spectral gap λ₁(M) − λ₂(M) as 20 and the noise parameters B and σ² as 5 and 0.01. We use the tuning parameter ρ = 0.1. We try three different matrix dimensions d ∈ {20, 50, 100} and three different support sizes s ∈ {5, 10, 20}. We set B = 5. We try three different spectral gaps λ₁(M) − λ₂(M) ∈ {10, 30, 50} and three different standard deviations of the normal distribution, σ_normal ∈ {0.1, 0.3, 0.5}. We try two different tuning parameters ρ ∈ {0.1, 0.01} and report the best result.
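
The parameter values quoted in the Experiment Setup row can be written out as explicit configuration grids. The sketch below is only illustrative: the split into two sweeps, the function names, and the dictionary keys are assumptions made here, not taken from the paper or its supplemental code (which is MATLAB/CVX). The excerpt does not say which experiment the ρ candidates belong to, so they are kept as a separate list rather than crossed with either sweep.

```python
from itertools import product

# Values quoted in the Experiment Setup row. Grouping them into two
# sweeps is an assumption; the excerpt does not label the experiments.
FIXED_FIRST = {"spectral_gap": 20, "B": 5, "sigma_sq": 0.01, "rho": 0.1}
DIMENSIONS = [20, 50, 100]       # matrix dimension d
SUPPORT_SIZES = [5, 10, 20]      # support size s

FIXED_SECOND = {"B": 5}
SPECTRAL_GAPS = [10, 30, 50]     # lambda_1(M) - lambda_2(M)
SIGMA_NORMAL = [0.1, 0.3, 0.5]   # std. dev. of the normal distribution

# Tuning parameters tried, best result reported. Which experiment this
# applies to is not stated in the excerpt, so it is not crossed below.
RHO_CANDIDATES = [0.1, 0.01]


def first_sweep():
    """Every (d, s) pair with the fixed gap, B, sigma^2 and rho."""
    return [dict(FIXED_FIRST, d=d, s=s)
            for d, s in product(DIMENSIONS, SUPPORT_SIZES)]


def second_sweep():
    """Every (spectral gap, sigma_normal) pair with B fixed."""
    return [dict(FIXED_SECOND, spectral_gap=gap, sigma_normal=sigma)
            for gap, sigma in product(SPECTRAL_GAPS, SIGMA_NORMAL)]


if __name__ == "__main__":
    for config in first_sweep() + second_sweep():
        print(config)
```

Each sweep yields nine configurations. Because the paper reports repeating experiments with different random seeds without the excerpt giving a count, a seed dimension is left out of the grid.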