A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate
Authors: Ohad Shamir
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3. Experiments: We now turn to present some experiments, which demonstrate the performance of the VR-PCA algorithm. |
| Researcher Affiliation | Academia | Ohad Shamir (OHAD.SHAMIR@WEIZMANN.AC.IL), Weizmann Institute of Science, Rehovot, Israel |
| Pseudocode | Yes | Algorithm 1 VR-PCA |
| Open Source Code | No | The paper does not contain any statement about making the source code available or provide a link to a code repository. |
| Open Datasets | Yes | Next, we performed a similar experiment using the training data of the well-known MNIST and CCAT datasets. The MNIST data matrix is of size 784 × 70000, and was preprocessed by centering the data and dividing each coordinate by its standard deviation times the square root of the dimension. The CCAT data matrix is sparse (only 0.16% of entries are non-zero), of size 23149 × 781265, and was used as-is. (A sketch of this preprocessing follows the table.) |
| Dataset Splits | No | The paper mentions using 'training data' for MNIST and CCAT datasets, but it does not specify any training/validation/test splits (e.g., percentages, sample counts, or specific predefined splits) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') used in the experiments. |
| Experiment Setup | Yes | Rather than tuning its parameters, we used the following fixed heuristic: The epoch length m was set to n (the number of data points, or columns in the data matrix), and η was set to η = 1/(r̄√n), where r̄ = (1/n) Σᵢ₌₁ⁿ ‖xᵢ‖² is the average squared norm of the data. The choice of m = n ensures that at each epoch, the runtime is about equally divided between the stochastic updates and the computation of ũ. The choice of η is motivated by our theoretical analysis, which requires η on the order of 1/(maxᵢ ‖xᵢ‖² √n) in the regime where m should be on the order of n. All algorithms were initialized from the same random vector, chosen uniformly at random from the unit ball. (A code sketch of this heuristic follows the table.) |
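
For reference, the following is a minimal sketch of the MNIST preprocessing quoted in the Open Datasets row, assuming the data is held as a dense d × n NumPy array with one example per column; the function name `preprocess_mnist` and the `eps` guard against zero-variance coordinates are illustrative assumptions, not details from the paper.

```python
import numpy as np

def preprocess_mnist(X, eps=1e-12):
    """Center each coordinate and divide it by its standard deviation
    times the square root of the dimension, as described in the paper.

    X is assumed to be a d x n array with one example per column.
    The eps term guards against zero-variance coordinates (an added
    safeguard, not mentioned in the paper).
    """
    d = X.shape[0]
    X = X - X.mean(axis=1, keepdims=True)
    X = X / ((X.std(axis=1, keepdims=True) + eps) * np.sqrt(d))
    return X
```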
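
The Experiment Setup row fixes the parameters of Algorithm 1 (VR-PCA) rather than tuning them. Below is a minimal sketch of the leading-component (k = 1) case with that heuristic, again assuming a dense d × n NumPy array of examples stored as columns; the function name, number of epochs, and seeding are assumptions made for illustration.

```python
import numpy as np

def vr_pca(X, epochs=10, seed=0):
    """Sketch of VR-PCA (k = 1) with the paper's fixed heuristic:
    epoch length m = n and step size eta = 1 / (r_bar * sqrt(n)),
    where r_bar is the average squared norm of the data columns."""
    d, n = X.shape
    rng = np.random.default_rng(seed)

    m = n                                    # epoch length m = n
    r_bar = np.mean(np.sum(X ** 2, axis=0))  # average squared column norm
    eta = 1.0 / (r_bar * np.sqrt(n))         # eta = 1 / (r_bar * sqrt(n))

    # Random unit-norm starting point (stand-in for the paper's random init).
    w_tilde = rng.standard_normal(d)
    w_tilde /= np.linalg.norm(w_tilde)

    for _ in range(epochs):
        # Full pass: exact update direction at the reference point w_tilde.
        u = X @ (X.T @ w_tilde) / n
        w = w_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            x = X[:, i]
            # Variance-reduced stochastic step, then projection to the sphere.
            w = w + eta * (x * (x @ w - x @ w_tilde) + u)
            w /= np.linalg.norm(w)
        w_tilde = w
    return w_tilde
```

A hypothetical end-to-end call on the preprocessed data would be `w = vr_pca(preprocess_mnist(X))`, with progress measured against the leading eigenvector of (1/n) X Xᵀ.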