Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Efficient Convex Relaxations for Streaming PCA
Authors: Raman Arora, Teodor Vanislavov Marinov
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We include experiments on synthetic data as proof of concept. We also propose more practical variants of MB-MSG and MB-RMSG, which however, do not have theoretical guarantees. Suboptimality is expressed in terms of h P Pt, Ci, where P and C are calculated over a test set. We present plots of total runtime to achieve -suboptimality, and rank of the iterates throughout the iterations. The x-axis of the plots is taken to be on a logarithmic scale. We use the k-SVD routine implemented by Liu et al. (2013).7.1 Synthetic data We generate synthetic data with a large eigengap in the following way. The data is sampled from a multi-variate Normal distribution with zero mean and diagonal covariance matrix . For each value of k, we have i,i = 1 for 1 i k and i,i = gap 2 i 0.1 for k + 1 i d. In our experiments gap = 0.1, k 2 {1, 3, 7} and d = 1000. The empirical results on the synthetic data set can be found in Figure 1. |
| Researcher Affiliation | Academia | Raman Arora Dept. of Computer Science Johns Hopkins University Baltimore, MD 21204 EMAIL Teodor V. Marinov Dept. of Computer Science Johns Hopkins University Baltimore, MD 21204 EMAIL |
| Pseudocode | Yes | Algorithm 1 Mini-batched MSG (MB-MSG) and Algorithm 2 Mini-batched 2-Regularized MSG (MB-RMSG) are presented. |
| Open Source Code | No | The paper does not provide any statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We now present empirical results on the MNIST dataset (Le Cun, 1998) for a more practical variant of algorithms 1 and 2. |
| Dataset Splits | No | The paper mentions that "Suboptimality is expressed in terms of h P Pt, Ci, where P and C are calculated over a test set." confirming the use of a test set, but it does not provide explicit details about training, validation, or specific test splits (e.g., percentages, sample counts, or explicit standard split names for all three parts). |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU specifications, or cloud computing details). |
| Software Dependencies | No | The paper mentions using "the k-SVD routine implemented by Liu et al. (2013)" but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For each value of k, we have i,i = 1 for 1 i k and i,i = gap 2 i 0.1 for k + 1 i d. In our experiments gap = 0.1, k 2 {1, 3, 7} and d = 1000. ... The update on line 7 is an iteration of SGD on the regularized objective in Equation 4, with λ = (C)/2. ... We did not tune the initial step size for any of the algorithms but rather set step size as recommended by theory. |