Residual-Based Sampling for Online Outlier-Robust PCA

Authors: Tianhao Zhu, Jie Shen

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we report some numerical results on synthetic data. Our goal is to illustrate the properties of the online robust algorithm discussed in section 1, and compare our residual-based sampling for online robust PCA algorithm with the algorithm in Feng et al. (2013a)... Simulation results for optimum low rank k = 5 with different number of outliers z = 100, 150, 200 have been shown in Figure 1. Table 2. Comparison for the embedding dimension, marked outliers and execution time of our algorithm and Feng et al. (2013a) when d = 500, k = 5, z = 100.
Researcher Affiliation | Academia | Tianhao Zhu¹, Jie Shen¹. ¹Department of Computer Science, Stevens Institute of Technology, Hoboken, New Jersey, USA. Correspondence to: Tianhao Zhu <tzhu12@stevens.edu>, Jie Shen <jie.shen@stevens.edu>.
Pseudocode | Yes | Algorithm 1 Online ORPCA with Logarithmic Approximation Error
Open Source Code | No | The paper does not include an explicit statement or a link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper states: 'In this section, we report some numerical results on synthetic data. Our goal is to illustrate the properties of the online robust algorithm... To make a fair comparison, we simulate the contaminated data as follows. We randomly generate a d × k matrix A...'
Dataset Splits | No | The paper describes experiments on 'synthetic data' and presents 'Simulation results' but does not explicitly mention or specify training, validation, or test dataset splits.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software components or libraries used in the experiments.
Experiment Setup | Yes | To make a fair comparison, we simulate the contaminated data as follows. We randomly generate a d × k matrix A, and scale it to make its magnitudes of the leading eigenvalues = 2. Then we multiply A with another uniformly generated matrix X ∈ R^{k×n} to make L = AX. A fraction λ of outliers are generated with uniform distribution over [−20, 20], where z = λn is the number of outliers. Simulation results for optimum low rank k = 5 with different number of outliers z = 100, 150, 200 have been shown in Figure 1.
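
The data generation described under Experiment Setup can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name, the Gaussian draw for A, the [0, 1) range for X, the value of n, and the choice to replace whole columns with outliers are all assumptions, since the excerpt does not specify them. "Leading eigenvalues = 2" is read here as scaling A so that its largest singular value equals 2.

```python
import numpy as np


def simulate_contaminated_data(d=500, k=5, n=1000, z=100, seed=0):
    """Hypothetical helper: low-rank data L = A X with z outlier columns.

    Assumptions (not stated in the excerpt): A is Gaussian, X is uniform
    on [0, 1), n = 1000, and each outlier replaces an entire column.
    """
    rng = np.random.default_rng(seed)

    # Random d x k basis, rescaled so its largest singular value is 2
    # (one reading of "magnitudes of the leading eigenvalues = 2").
    A = rng.standard_normal((d, k))
    A *= 2.0 / np.linalg.norm(A, 2)

    # Uniformly generated k x n coefficient matrix, then L = A X.
    X = rng.uniform(0.0, 1.0, size=(k, n))
    L = A @ X

    # Overwrite z randomly chosen columns with entries uniform on [-20, 20].
    M = L.copy()
    outlier_idx = rng.choice(n, size=z, replace=False)
    M[:, outlier_idx] = rng.uniform(-20.0, 20.0, size=(d, z))

    return M, L, outlier_idx
```

Calling `simulate_contaminated_data(z=150)` or `z=200` would mirror the other outlier counts mentioned for Figure 1, with λ = z / n determined by whatever n is assumed.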