Residual-Based Sampling for Online Outlier-Robust PCA
Authors: Tianhao Zhu, Jie Shen
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we report some numerical results on synthetic data. Our goal is to illustrate the properties of the online robust algorithm discussed in section 1, and compare our residual-based sampling for online robust PCA algorithm with the algorithm in Feng et al. (2013a)... Simulation results for optimum low rank k = 5 with different number of outliers z = 100, 150, 200 have been shown in Figure 1. Table 2. Comparison for the embedding dimension, marked outliers and execution time of our algorithm and Feng et al. (2013a) when d = 500, k = 5, z = 100. |
| Researcher Affiliation | Academia | Tianhao Zhu¹, Jie Shen¹. ¹Department of Computer Science, Stevens Institute of Technology, Hoboken, New Jersey, USA. Correspondence to: Tianhao Zhu <tzhu12@stevens.edu>, Jie Shen <jie.shen@stevens.edu>. |
| Pseudocode | Yes | Algorithm 1 Online ORPCA with Logarithmic Approximation Error |
| Open Source Code | No | The paper does not include an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper states: 'In this section, we report some numerical results on synthetic data. Our goal is to illustrate the properties of the online robust algorithm... To make a fair comparison, we simulate the contaminated data as follows. We randomly generate a d × k matrix A...' |
| Dataset Splits | No | The paper describes experiments on 'synthetic data' and presents 'Simulation results' but does not explicitly mention or specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software components or libraries used in the experiments. |
| Experiment Setup | Yes | To make a fair comparison, we simulate the contaminated data as follows. We randomly generate a d × k matrix A, and scale it to make its magnitudes of the leading eigenvalues = 2. Then we multiply A with another uniformly generated matrix X ∈ ℝ^{k×n} to make L = AX. A fraction λ of outliers are generated with uniform distribution over [−20, 20], where z = λn is the number of outliers. Simulation results for optimum low rank k = 5 with different number of outliers z = 100, 150, 200 have been shown in Figure 1. |
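
The data-generation procedure quoted in the Experiment Setup row can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code (the paper releases none). The function name `simulate_contaminated_data`, the default n = 1000, the Gaussian entries of A, the reading of "leading eigenvalues = 2" as "largest singular value of A equals 2", the range [0, 1) for X, and the choice to overwrite z randomly selected columns with outliers are all assumptions not confirmed by the quoted text.

```python
import numpy as np


def simulate_contaminated_data(d=500, k=5, n=1000, z=100, seed=0):
    """Return (M, L, outlier_idx): contaminated data, clean low-rank data,
    and the indices of the outlier columns.

    d, k, z follow the values quoted from the paper; n and seed are
    hypothetical defaults chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Assumption: A has i.i.d. standard normal entries.
    A = rng.standard_normal((d, k))
    # Assumption: "leading eigenvalues = 2" is read as rescaling A so that
    # its largest singular value (spectral norm) equals 2.
    A *= 2.0 / np.linalg.norm(A, 2)

    # Assumption: X is drawn uniformly from [0, 1).
    X = rng.uniform(0.0, 1.0, size=(k, n))
    L = A @ X  # clean rank-k data, one sample per column

    # Assumption: outliers overwrite z randomly chosen columns.
    outlier_idx = rng.choice(n, size=z, replace=False)
    M = L.copy()
    M[:, outlier_idx] = rng.uniform(-20.0, 20.0, size=(d, z))
    return M, L, outlier_idx
```

Calling `simulate_contaminated_data(z=150)` or `z=200` would give the other two contamination levels mentioned in the row, with the outlier fraction λ = z / n following from whatever n is chosen.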