Efficient Convex Relaxations for Streaming PCA
Authors: Raman Arora, Teodor Vanislavov Marinov
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We include experiments on synthetic data as proof of concept. We also propose more practical variants of MB-MSG and MB-RMSG, which, however, do not have theoretical guarantees. Suboptimality is expressed in terms of ⟨P − P_t, C⟩, where P and C are calculated over a test set. We present plots of total runtime to achieve ε-suboptimality, and of the rank of the iterates across iterations; the x-axis of the plots is on a logarithmic scale. We use the k-SVD routine implemented by Liu et al. (2013). (Section 7.1, Synthetic data) We generate synthetic data with a large eigengap as follows. The data is sampled from a multivariate normal distribution with zero mean and diagonal covariance matrix Σ. For each value of k, we have Σ_{i,i} = 1 for 1 ≤ i ≤ k and Σ_{i,i} = gap · 2^{−⌈i · 0.1⌉} for k + 1 ≤ i ≤ d. In our experiments gap = 0.1, k ∈ {1, 3, 7}, and d = 1000. The empirical results on the synthetic data set can be found in Figure 1. |
| Researcher Affiliation | Academia | Raman Arora, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, arora@cs.jhu.edu; Teodor V. Marinov, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, tmarino2@jhu.edu |
| Pseudocode | Yes | Algorithm 1 Mini-batched MSG (MB-MSG) and Algorithm 2 Mini-batched 2-Regularized MSG (MB-RMSG) are presented. |
| Open Source Code | No | The paper does not provide any statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We now present empirical results on the MNIST dataset (LeCun, 1998) for a more practical variant of Algorithms 1 and 2. |
| Dataset Splits | No | The paper states that "Suboptimality is expressed in terms of ⟨P − P_t, C⟩, where P and C are calculated over a test set," confirming the use of a test set, but it does not provide explicit details about training, validation, or test splits (e.g., percentages, sample counts, or standard split names for all three parts). |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU specifications, or cloud computing details). |
| Software Dependencies | No | The paper mentions using "the k-SVD routine implemented by Liu et al. (2013)" but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For each value of k, we have Σ_{i,i} = 1 for 1 ≤ i ≤ k and Σ_{i,i} = gap · 2^{−⌈i · 0.1⌉} for k + 1 ≤ i ≤ d. In our experiments gap = 0.1, k ∈ {1, 3, 7}, and d = 1000. ... The update on line 7 is an iteration of SGD on the regularized objective in Equation 4, with λ = (C)/2. ... We did not tune the initial step size for any of the algorithms but rather set the step size as recommended by theory. |
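The synthetic-data recipe quoted above (zero-mean Gaussian data with a diagonal covariance whose top k entries are 1 and whose tail decays from `gap`) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the exact tail-decay schedule `gap * 2^{-ceil(0.1 * i)}` are assumptions, since the paper's formula is only partially recoverable from this extraction.

```python
import numpy as np

def synthetic_eigengap_data(n, d=1000, k=3, gap=0.1, seed=0):
    """Sample n points from N(0, Sigma) with diagonal Sigma:
    Sigma_ii = 1 for i <= k, and a geometrically decaying tail
    starting at roughly `gap` for i > k, creating a large eigengap
    at index k (mirroring the paper's synthetic setup).

    NOTE: the tail schedule gap * 2^{-ceil(0.1 * i)} is an assumed
    reconstruction of the paper's (garbled) formula.
    """
    diag = np.ones(d)
    i = np.arange(k + 1, d + 1)                  # 1-based indices of the tail
    diag[k:] = gap * 2.0 ** (-np.ceil(0.1 * i))  # assumed decay schedule
    rng = np.random.default_rng(seed)
    # Diagonal covariance: scale each coordinate by sqrt(Sigma_ii).
    X = rng.standard_normal((n, d)) * np.sqrt(diag)
    return X, diag

X, diag = synthetic_eigengap_data(n=5000, d=1000, k=3, gap=0.1)
```

Because the covariance is diagonal, sampling reduces to coordinate-wise scaling of standard normals, which is far cheaper than a full Cholesky-based multivariate draw at d = 1000.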
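The suboptimality metric quoted in the table, ⟨P − P_t, C⟩ with P and C computed over a test set, is just a trace inner product between projection matrices weighted by the test covariance. A hedged sketch of how one might compute it (the function name and interface are illustrative, not from the paper's code; P here is taken to be the projector onto the top-k eigenspace of C):

```python
import numpy as np

def pca_suboptimality(P_t, C, k):
    """Compute <P - P_t, C> = tr((P - P_t)^T C), where P projects onto
    the top-k eigenspace of the (test-set) covariance C and P_t is the
    algorithm's current rank-k projection iterate.

    Illustrative helper; not the authors' implementation.
    """
    w, V = np.linalg.eigh(C)        # eigenvalues in ascending order
    Vk = V[:, -k:]                  # top-k eigenvectors
    P_star = Vk @ Vk.T              # optimal rank-k projector
    return np.trace((P_star - P_t).T @ C)
```

At the optimum (P_t equal to the top-k projector of C) this quantity is zero, and it is nonnegative for any rank-k projector P_t, which is why the paper reports runtime to reach ε-suboptimality under this measure.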