Streaming Sparse Principal Component Analysis
Authors: Wenzhuo Yang, Huan Xu
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on synthetic and real-world datasets demonstrate good empirical performance of the proposed algorithms. We investigate the performance of our algorithms on a variety of simulated and real-world datasets. |
| Researcher Affiliation | Academia | Wenzhuo Yang (A0096049@NUS.EDU.SG), Department of Mechanical Engineering, National University of Singapore, Singapore 117576. Huan Xu (MPEXUH@NUS.EDU.SG), Department of Mechanical Engineering, National University of Singapore, Singapore 117576. |
| Pseudocode | Yes | Algorithm 1 Row Truncation Operator. Algorithm 2 Streaming SPCA via Row Truncation. Algorithm 3 Streaming SPCA via Iterative Deflation. Algorithm 4 Streaming ECA via Row Truncation. Algorithm 5 Finding Initial Solution. |
| Open Source Code | No | The paper does not provide any explicit statement or link for the open-sourcing of the described methodology's code. |
| Open Datasets | Yes | We use two large datasets, the NIPS paper dataset and the NYTimes news articles dataset, both available from the UCI Machine Learning Repository (Bache & Lichman). |
| Dataset Splits | No | The paper describes generating synthetic data and using real-world datasets with parameters like block size (B) and total samples (n). However, it does not explicitly provide information on standard train/validation/test splits, percentages, or sample counts needed for data partitioning. |
| Hardware Specification | Yes | The experiments are conducted on a desktop PC with an i7 3.4GHz CPU and 4G memory. |
| Software Dependencies | No | The paper states: "All the algorithms mentioned below are implemented in Python." This names the programming language but does not specify a version number or any software libraries with their respective versions. |
| Experiment Setup | Yes | Parameters B and γ in streaming sparse PCA are set to 300 and 500, respectively. In the following experiments, the samples are independently drawn from $EC_p(0, \Sigma, \xi)$. Here, $\Sigma$ is constructed according to $\Sigma = AA^\top + I_p$ where A is generated following the first scheme described above, and ξ follows the chi-distribution with degree of freedom p or the F-distribution with degrees of freedom p and 1. |
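
The elliptical sampling in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the standard stochastic representation x = ξ · L · u, where L Lᵀ = Σ = AAᵀ + I_p and u is uniform on the unit sphere. The function name `sample_elliptical`, the NumPy dependency, and the demo matrix `A_demo` are hypothetical: the paper's "first scheme" for generating A is not reproduced on this page, so a placeholder row-sparse matrix stands in for it. This is not the authors' code.

```python
import numpy as np

def sample_elliptical(n, A, xi_kind="chi", seed=0):
    """Draw n samples x = xi * L u with L L^T = Sigma = A A^T + I_p.

    Assumed construction, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    sigma = A @ A.T + np.eye(p)
    L = np.linalg.cholesky(sigma)  # lower-triangular factor of Sigma

    # u: uniform on the unit sphere in R^p (normalized Gaussian vectors)
    g = rng.standard_normal((n, p))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)

    if xi_kind == "chi":
        # chi-distribution with p degrees of freedom (x is then Gaussian)
        xi = np.sqrt(rng.chisquare(p, size=n))
    elif xi_kind == "f":
        # F-distribution with degrees of freedom p and 1 (heavy-tailed)
        xi = rng.f(p, 1, size=n)
    else:
        raise ValueError("xi_kind must be 'chi' or 'f'")

    # Each row is one sample: xi_i * (L @ u_i)
    return xi[:, None] * (u @ L.T), sigma

# Placeholder A (hypothetical): p = 8 variables, 2 components,
# only the first 3 rows are nonzero.
A_demo = np.zeros((8, 2))
A_demo[:3] = np.random.default_rng(1).standard_normal((3, 2))
X_demo, sigma_demo = sample_elliptical(20000, A_demo, xi_kind="chi")
```

With the chi-distributed ξ the samples are Gaussian with covariance Σ, so the empirical covariance of `X_demo` should be close to `sigma_demo`. The F-distributed ξ yields heavy-tailed samples whose sample covariance need not be finite.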