Robust Streaming PCA

Authors: Daniel Bienstock, Minchan Jeong, Apurv Shukla, Se-Young Yun

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we demonstrate the validity of our analysis through numerical experiments on synthetic and real-world dataset.
Researcher Affiliation | Academia | Daniel Bienstock, IEOR Department, Columbia University; Minchan Jeong, Graduate School of AI, KAIST; Apurv Shukla, IEOR Department, Columbia University; Se-Young Yun, Graduate School of AI, KAIST. {dano,as5197}@columbia.edu, {mcjeong,yunseyoung}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: Noisy Power Method with block size B [29]. 1: Input: stream of vectors (x_t)_{t=1}^T, block size B, dimensions p, k... (a minimal implementation sketch appears after this table)
Open Source Code | No | The paper does not provide any explicit statement about making its source code available or include a link to a code repository.
Open Datasets | Yes | We provide the real-world benchmark using the S&P500 stock market dataset [44] in Kaggle to test our findings in the non-stationary environment. ... [44] Paul Mooney. Stock Market Data (NASDAQ, NYSE, S&P500). https://www.kaggle.com/datasets/paultimothymooney/stock-market-data, Version 64.
Dataset Splits | No | The paper mentions using a 'target space' derived from the final 500 samples for evaluation, but it does not specify typical dataset splits (e.g., training, validation, test percentages or counts) or cross-validation strategies.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or specific solvers).
Experiment Setup | Yes | We identify the optimal block size B, the unique parameter for the noisy power method, and establish an upper bound on the estimation error of the noisy power method. ... The critical difference between the noisy power method and Oja's algorithm is the data used to estimate the principal components. In the noisy power method, the estimates are updated after a batch of observations, whereas in Oja's algorithm, the estimates are updated after scaling every observation with the learning rate. Therefore, the parameters determining the performance of these algorithms are the batch size B for the robust power method and the learning rate for Oja's algorithm. (a sketch contrasting the two update rules appears after this table)
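
To make the Pseudocode row concrete, here is a minimal NumPy sketch of a block-wise noisy power method in the spirit of the quoted Algorithm 1. The function name, argument names, and random initialization are illustrative assumptions, not the authors' released code (the paper provides none).

```python
import numpy as np

def noisy_power_method(stream, p, k, B, rng=None):
    """Block-wise power iteration over a stream of p-dimensional vectors.

    Minimal sketch of the quoted Algorithm 1 (noisy power method with block
    size B); names and initialization are illustrative, not the authors' code.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random orthonormal initialization of the p x k subspace estimate.
    Q, _ = np.linalg.qr(rng.standard_normal((p, k)))
    block = []
    for x in stream:
        block.append(np.asarray(x, dtype=float))
        if len(block) == B:
            X = np.stack(block)          # B x p block of observations
            # Apply the block's empirical covariance to Q:
            #   S = (1/B) * sum_t x_t x_t^T Q
            S = X.T @ (X @ Q) / B
            Q, _ = np.linalg.qr(S)       # re-orthonormalize the estimate
            block = []
    return Q
```

The only tuning parameter is the block size B: larger blocks average away more per-sample noise but update the subspace estimate less often, which is the trade-off the paper's experiment setup studies.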
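For contrast with the Experiment Setup row, the following sketch shows the per-sample Oja update, where the learning rate plays the role that the block size B plays above. The data model, dimensions, and parameter values are hypothetical and only illustrate how the tuning parameter enters the update; none of this is taken from the paper's code.

```python
import numpy as np

def oja_streaming(stream, p, k, lr, rng=None):
    """Per-sample Oja update Q <- QR(Q + lr * x x^T Q); illustrative sketch."""
    rng = np.random.default_rng() if rng is None else rng
    Q, _ = np.linalg.qr(rng.standard_normal((p, k)))
    for x in stream:
        x = np.asarray(x, dtype=float)
        Q = Q + lr * np.outer(x, x @ Q)  # rank-one stochastic gradient step
        Q, _ = np.linalg.qr(Q)           # project back to an orthonormal frame
    return Q

# Hypothetical synthetic check: spiked covariance with a planted k-dim subspace.
rng = np.random.default_rng(0)
p, k, T = 50, 2, 20_000
U, _ = np.linalg.qr(rng.standard_normal((p, k)))
data = (3.0 * rng.standard_normal((T, k))) @ U.T + 0.5 * rng.standard_normal((T, p))

Q_oja = oja_streaming(iter(data), p, k, lr=1e-3, rng=rng)
# Subspace error ||U - Q Q^T U||_2 is small when the planted subspace is recovered.
err = np.linalg.norm(U - Q_oja @ (Q_oja.T @ U), 2)
print(f"Oja subspace error: {err:.3f}")
```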