Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization
Authors: Poorya Mianjy, Raman Arora
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical results for our proposed algorithms ℓ2-RMSG, ℓ1-RMSG, and ℓ2,1-RMSG, compared to vanilla MSG, Oja's algorithm, and the Follow The Leader (FTL) algorithm, on both synthetic and real datasets. The synthetic data is drawn from a d = 100 dimensional zero-mean multivariate Gaussian distribution with an exponential decay in the spectrum of the covariance matrix. The synthetic dataset consists of n = 30K samples, out of which 20K samples are used for training and 5K each for tuning and testing. For comparisons on a real dataset, we choose MNIST, which consists of n = 60K samples each of size d = 784. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Johns Hopkins University, Baltimore, USA. Correspondence to: Raman Arora <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 ℓ2-Regularized MSG (ℓ2-RMSG); Algorithm 2 ℓ1-Regularized MSG (ℓ1-RMSG); Algorithm 3 ℓ2 + ℓ1-Regularized MSG (ℓ2,1-RMSG) |
| Open Source Code | No | The paper does not provide any links to source code or explicitly state that the code is publicly available. |
| Open Datasets | Yes | For comparisons on a real dataset, we choose MNIST which consists of n = 60K samples each of size d = 784. |
| Dataset Splits | Yes | The synthetic dataset consists of n = 30K samples, out of which 20K samples are used for training and 5K each for tuning and testing. |
| Hardware Specification | No | The paper states 'The runtime is captured in a controlled setting; each run for every algorithm was on a dedicated identical compute node.' but does not provide specific hardware details like CPU or GPU models. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | For MSG and ℓ1-RMSG, the learning rate is set to η0/√t, and for ℓ2-RMSG, ℓ2,1-RMSG and Oja the learning rate was set to η0/t as suggested by theory. We choose η0 (initial learning rate), λ and µ by tuning each over the set {10⁻³, 10⁻², 10⁻¹, 1, 10, 10², 10³} on held-out data, for k = 40. |
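
The synthetic setup quoted in the table (a d = 100 zero-mean Gaussian with an exponentially decaying covariance spectrum, split 20K/5K/5K) can be sketched as follows. The decay rate (0.1 here), the random orthonormal basis, and the RNG seed are assumptions for illustration; the paper's exact spectrum parameters are not quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 30_000

# Assumed spectrum: eigenvalues decaying exponentially (rate 0.1 is a
# placeholder), embedded in a random orthonormal basis Q.
eigvals = np.exp(-0.1 * np.arange(d))
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
cov = Q @ np.diag(eigvals) @ Q.T

# Draw n = 30K zero-mean Gaussian samples and split 20K / 5K / 5K
# into train / tune / test, as described in the table.
X = rng.multivariate_normal(np.zeros(d), cov, size=n)
X_train, X_tune, X_test = X[:20_000], X[20_000:25_000], X[25_000:]
```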
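
The experiment-setup row describes two learning-rate schedules and a three-way hyperparameter grid; a minimal sketch of both, with function names chosen here for illustration:

```python
import itertools
import numpy as np

def eta_sqrt(eta0, t):
    # η_t = η0 / √t, used for MSG and ℓ1-RMSG per the table.
    return eta0 / np.sqrt(t)

def eta_linear(eta0, t):
    # η_t = η0 / t, used for ℓ2-RMSG, ℓ2,1-RMSG, and Oja's algorithm.
    return eta0 / t

# Tuning grid for η0, λ, and µ: each drawn from {10^-3, ..., 10^3},
# giving 7^3 = 343 candidate (η0, λ, µ) triples evaluated on held-out data.
grid = [10.0**p for p in range(-3, 4)]
configs = list(itertools.product(grid, grid, grid))
```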