Support Recovery in Sparse PCA with Incomplete Data
Authors: Hanbyul Lee, Qifan Song, Jean Honorio
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical results with incomplete synthetic data, and show encouraging and meaningful results on a gene expression dataset. From Section 4, Numerical Results: We perform the SDP algorithm of (1) on synthetic and real data to validate our theoretic results and show how well the true support of the sparse principal component is exactly recovered. |
| Researcher Affiliation | Academia | Hanbyul Lee Department of Statistics Purdue University West Lafayette, IN 47906 lee3078@purdue.edu Qifan Song Department of Statistics Purdue University West Lafayette, IN 47906 qfsong@purdue.edu Jean Honorio Department of Computer Science Purdue University West Lafayette, IN 47906 jhonorio@purdue.edu |
| Pseudocode | No | The paper describes mathematical formulations and optimization problems, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide the code in the supplemental material. |
| Open Datasets | Yes | We analyze a gene expression dataset (GSE21385) from the Gene Expression Omnibus website (https://www.ncbi.nlm.nih.gov/geo/). |
| Dataset Splits | No | The paper describes repeating experiments with different random seeds and generating synthetic data, but does not explicitly provide details about training, validation, and test dataset splits for model development or evaluation. |
| Hardware Specification | No | The paper states 'Our experiments were executed on MATLAB and standard CVX code was used' but does not specify any hardware details such as GPU/CPU models or memory. The authors also explicitly answered 'No' to the question 'Did you include the total amount of compute and the type of resources used?' in their own analysis. |
| Software Dependencies | No | The paper mentions 'MATLAB and standard CVX code was used', but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | In this experiment, we fix the spectral gap λ₁(M) − λ₂(M) as 20 and the noise parameters B and σ² as 5 and 0.01. We use the tuning parameter ρ = 0.1. We try three different matrix dimensions d ∈ {20, 50, 100} and three different support sizes s ∈ {5, 10, 20}. We set B = 5. We try three different spectral gaps λ₁(M) − λ₂(M) ∈ {10, 30, 50} and three different standard deviations of the normal distribution, σ_normal ∈ {0.1, 0.3, 0.5}. We try two different tuning parameters ρ ∈ {0.1, 0.01} and report the best result. |
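
The settings quoted in the Experiment Setup row can be enumerated as parameter grids. The following is a minimal Python sketch, not the authors' code (the paper reports MATLAB with CVX). The grouping of the quoted sentences into two separate grids is an assumption, and every name (`FIXED_SETUP`, `dimension_support_grid`, `gap_noise_grid`, the dictionary keys) is hypothetical and chosen only for illustration.

```python
from itertools import product

# Values below are copied from the Experiment Setup row.
# Assumption: the quoted sentences describe two separate synthetic experiments.

# First quoted setup: spectral gap, noise parameters, and rho are held fixed.
FIXED_SETUP = {"spectral_gap": 20, "B": 5, "sigma2": 0.01, "rho": 0.1}
DIMENSIONS = [20, 50, 100]      # matrix dimensions d
SUPPORT_SIZES = [5, 10, 20]     # support sizes s

# Second quoted setup: B is fixed; spectral gap and sigma_normal vary.
SPECTRAL_GAPS = [10, 30, 50]
SIGMA_NORMALS = [0.1, 0.3, 0.5]

# Tuning parameters tried where the best result is reported.
RHO_CANDIDATES = [0.1, 0.01]


def dimension_support_grid():
    """Return one config dict per (d, s) pair with the fixed settings attached."""
    return [dict(FIXED_SETUP, d=d, s=s)
            for d, s in product(DIMENSIONS, SUPPORT_SIZES)]


def gap_noise_grid():
    """Return one config dict per (spectral gap, sigma_normal) pair with B = 5."""
    return [{"B": 5, "spectral_gap": gap, "sigma_normal": sigma}
            for gap, sigma in product(SPECTRAL_GAPS, SIGMA_NORMALS)]


if __name__ == "__main__":
    for config in dimension_support_grid() + gap_noise_grid():
        print(config)
```

Each grid contains nine configurations. Because the paper mentions repeating experiments with different random seeds but this page gives no repetition count, the sketch leaves the seed loop out.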