Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization

Authors: Poorya Mianjy, Raman Arora

ICML 2018 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide empirical results for our proposed algorithms ℓ2-RMSG, ℓ1-RMSG, and ℓ2,1-RMSG, compared to vanilla MSG, Oja's algorithm, and the Follow The Leader (FTL) algorithm, on both synthetic and real datasets. The synthetic data is drawn from a d = 100 dimensional zero-mean multivariate Gaussian distribution with an exponential decay in the spectrum of the covariance matrix. The synthetic dataset consists of n = 30K samples, out of which 20K samples are used for training and 5K each for tuning and testing. For comparisons on a real dataset, we choose MNIST, which consists of n = 60K samples each of size d = 784.
Researcher Affiliation Academia 1Department of Computer Science, Johns Hopkins University, Baltimore, USA. Correspondence to: Raman Arora <EMAIL>.
Pseudocode Yes Algorithm 1 ℓ2-Regularized MSG (ℓ2-RMSG); Algorithm 2 ℓ1-Regularized MSG (ℓ1-RMSG); Algorithm 3 ℓ2 + ℓ1-Regularized MSG (ℓ2,1-RMSG)
Open Source Code No The paper does not provide any links to source code or explicitly state that the code is publicly available.
Open Datasets Yes For comparisons on a real dataset, we choose MNIST which consists of n = 60K samples each of size d = 784.
Dataset Splits Yes The synthetic dataset consists of n = 30K samples, out of which 20K samples are used for training and 5K each for tuning and testing.
Hardware Specification No The paper states 'The runtime is captured in a controlled setting; each run for every algorithm was on a dedicated, identical compute node.' but does not provide specific hardware details such as CPU or GPU models.
Software Dependencies No The paper does not specify any software dependencies with version numbers.
Experiment Setup Yes For MSG and ℓ1-RMSG, the learning rate is set to η0/√t, and for ℓ2-RMSG, ℓ2,1-RMSG, and Oja the learning rate was set to η0/t, as suggested by theory. We choose η0 (the initial learning rate), λ, and µ by tuning each over the set {10^-3, 10^-2, 10^-1, 1, 10, 10^2, 10^3} on held-out data, for k = 40.
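The experiment setup above can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: it generates synthetic data matching the reported description (d = 100 zero-mean Gaussian with exponentially decaying covariance spectrum, 30K samples split 20K/5K/5K), and defines the two reported learning-rate schedules and the hyperparameter grid. The spectral decay rate (0.9) and the random orthonormal basis are assumptions; the report does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data as described: d = 100, zero-mean multivariate Gaussian
# with exponential decay in the covariance spectrum.
# NOTE: the decay rate 0.9 is an assumed value for illustration only.
d, n = 100, 30_000
eigvals = 0.9 ** np.arange(d)                       # exponentially decaying spectrum
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))    # random orthonormal basis (assumption)
cov = Q @ np.diag(eigvals) @ Q.T
X = rng.multivariate_normal(np.zeros(d), cov, size=n)

# Reported split: 20K train, 5K tuning, 5K test.
X_train, X_tune, X_test = X[:20_000], X[20_000:25_000], X[25_000:]

# Reported learning-rate schedules:
# eta0 / sqrt(t) for MSG and l1-RMSG; eta0 / t for l2-RMSG, l2,1-RMSG, and Oja.
def lr_sqrt(eta0, t):
    return eta0 / np.sqrt(t)

def lr_linear(eta0, t):
    return eta0 / t

# Grid used to tune eta0, lambda, and mu on held-out data (with k = 40).
grid = [1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3]
```

The split uses contiguous slices for simplicity; shuffling before splitting would be the usual practice but is not described in the report.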