Efficient Convex Relaxations for Streaming PCA

Authors: Raman Arora, Teodor Vanislavov Marinov

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We include experiments on synthetic data as proof of concept. We also propose more practical variants of MB-MSG and MB-RMSG, which, however, do not have theoretical guarantees. Suboptimality is expressed in terms of ⟨P − P_t, C⟩, where P and C are calculated over a test set. We present plots of total runtime to achieve ε-suboptimality, and of the rank of the iterates throughout the iterations. The x-axis of the plots is taken to be on a logarithmic scale. We use the k-SVD routine implemented by Liu et al. (2013). ... (Section 7.1, Synthetic data) We generate synthetic data with a large eigengap in the following way. The data is sampled from a multivariate Normal distribution with zero mean and diagonal covariance matrix Σ. For each value of k, we have Σ_{i,i} = 1 for 1 ≤ i ≤ k and Σ_{i,i} = gap · 2^{−0.1·i} for k + 1 ≤ i ≤ d. In our experiments gap = 0.1, k ∈ {1, 3, 7}, and d = 1000. The empirical results on the synthetic data set can be found in Figure 1.
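The synthetic construction quoted above can be sketched in a few lines of NumPy. This is a hedged sketch, not the authors' code: the tail decay gap · 2^(−0.1·i) is a reconstruction of the covariance formula, and the sample count n is an arbitrary choice; sampling by coordinate-wise scaling is valid only because Σ is diagonal.

```python
import numpy as np

# Synthetic data with a large eigengap (sketch; exact tail decay is
# a reconstruction of the paper's formula, and n is an assumption).
rng = np.random.default_rng(0)
d, k, gap = 1000, 3, 0.1

# Diagonal covariance: top-k eigenvalues equal 1, tail decays below gap.
eigvals = np.ones(d)
eigvals[k:] = gap * 2.0 ** (-0.1 * np.arange(k + 1, d + 1))

# Zero-mean Gaussian stream: since the covariance is diagonal, scale
# i.i.d. standard normals by the per-coordinate standard deviations.
n = 5000
X = rng.standard_normal((n, d)) * np.sqrt(eigvals)
```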
Researcher Affiliation | Academia | Raman Arora, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, arora@cs.jhu.edu; Teodor V. Marinov, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, tmarino2@jhu.edu
Pseudocode | Yes | Algorithm 1, Mini-batched MSG (MB-MSG), and Algorithm 2, Mini-batched ℓ2-Regularized MSG (MB-RMSG), are presented.
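For intuition about what those algorithms iterate, a single MSG-style step can be sketched as a rank-one gradient update followed by an eigenvalue projection onto the convex relaxation {P : 0 ⪯ P ⪯ I, tr(P) = k}. This is a minimal full-matrix sketch, not the paper's mini-batched or rank-controlled implementation; the bisection below is one standard way to realize the capped-simplex projection of the eigenvalues.

```python
import numpy as np

def project_capped_simplex(lam, k):
    """Project eigenvalues onto {s in [0,1]^d : sum(s) = k} by bisecting
    on a uniform shift; clip(lam + s).sum() is nondecreasing in s."""
    lo, hi = -lam.max(), 1.0 - lam.min()
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if np.clip(lam + mid, 0.0, 1.0).sum() < k:
            lo = mid
        else:
            hi = mid
    return np.clip(lam + 0.5 * (lo + hi), 0.0, 1.0)

def msg_step(P, x, eta, k):
    """One MSG-style iteration: rank-one stochastic-gradient step on the
    sample x, then projection back onto {0 <= P <= I, tr(P) = k}."""
    lam, U = np.linalg.eigh(P + eta * np.outer(x, x))
    sigma = project_capped_simplex(lam, k)
    return (U * sigma) @ U.T
```

A mini-batched variant would replace `np.outer(x, x)` with the averaged outer product of a batch; the projection step is unchanged.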
Open Source Code | No | The paper does not provide any statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We now present empirical results on the MNIST dataset (LeCun, 1998) for a more practical variant of Algorithms 1 and 2.
Dataset Splits | No | The paper mentions that "Suboptimality is expressed in terms of ⟨P − P_t, C⟩, where P and C are calculated over a test set," confirming the use of a test set, but it does not provide explicit details about training, validation, or test splits (e.g., percentages, sample counts, or standard split names for all three parts).
Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU specifications, or cloud computing details).
Software Dependencies | No | The paper mentions using "the k-SVD routine implemented by Liu et al. (2013)" but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | For each value of k, we have Σ_{i,i} = 1 for 1 ≤ i ≤ k and Σ_{i,i} = gap · 2^{−0.1·i} for k + 1 ≤ i ≤ d. In our experiments gap = 0.1, k ∈ {1, 3, 7}, and d = 1000. ... The update on line 7 is an iteration of SGD on the regularized objective in Equation 4, with λ = gap(C)/2. ... We did not tune the initial step size for any of the algorithms but rather set the step size as recommended by theory.
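The suboptimality metric ⟨P − P_t, C⟩ quoted above can be evaluated on held-out data as follows. This sketch assumes that P denotes the projection onto the top-k eigenvectors of the empirical test covariance C; the function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def suboptimality(P_t, X_test, k):
    """<P* - P_t, C> on a test set, where C is the empirical test
    covariance and P* the projection onto its top-k eigenvectors
    (assumed reading of the metric; names are illustrative)."""
    C = X_test.T @ X_test / len(X_test)
    _, U = np.linalg.eigh(C)          # eigh returns ascending order
    top = U[:, -k:]                   # top-k eigenvectors
    P_star = top @ top.T
    return np.trace((P_star - P_t) @ C)
```

For P_t = 0 this reduces to the sum of the top-k eigenvalues of C, and it vanishes when P_t recovers the top-k eigenspace.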