Nearly Optimal Robust Matrix Completion

Authors: Yeshwanth Cherapanamjeri, Kartik Gupta, Prateek Jain

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results corroborate our theoretical results and show that even for moderate sized problems, our method for robust PCA is an order of magnitude faster than the existing methods. In this section we discuss the performance of Algorithm 1 on synthetic data and its use in foreground background separation.
Researcher Affiliation | Industry | Microsoft Research India. Correspondence to: Prateek Jain <prajain@microsoft.com>.
Pseudocode | Yes | Algorithm 1: L̂ = PG-RMC(Ω, P_Ω(M), ϵ, r, µ, η, σ) and Algorithm 2: {Ω_1, ..., Ω_T} = SplitSamples(Ω, p, T)
Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology or links to a code repository.
Open Datasets | No | The paper mentions generating 'Synthetic data' for experiments and applying the method to 'shopping center video' and 'restaurant video' for foreground-background separation, but it does not provide concrete access information (links, DOIs, repositories, or citations with author/year) for these datasets.
Dataset Splits | No | The paper discusses error metrics and iterative processes but does not specify any dataset splits (e.g., percentages for training, validation, or test sets) or cross-validation methodology for reproducing the data partitioning.
Hardware Specification | No | The paper mentions that the algorithm was 'implemented in MATLAB' but provides no specific details about the hardware (e.g., GPU/CPU models, memory) used for conducting the experiments.
Software Dependencies | No | The paper states 'We implemented our algorithm in MATLAB' but does not specify the version of MATLAB or any other software dependencies with their respective version numbers.
Experiment Setup | Yes | The algorithm has three main parameters: 1) threshold ζ, 2) incoherence µ and 3) sampling probability p (E[|Ω|] = p · mn). In the experiments on synthetic data we observed that keeping ζ ∼ µ‖M − S^(t)‖_2/√n speeds up the recovery while for background extraction keeping ζ ∼ µ‖M − S^(t)‖_2/n gives a better quality output. The value of µ for real world data sets was figured out using cross validation while for the synthetic data the same value was used as used in data generation. The sampling probability for the synthetic data could be kept as low as (θ =) 2r log^2(n)/n while for the real world data set we get good results for p = 0.05. We define effective sample size as the ratio between the sampling probability and θ.
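
The sampling quantities in the Experiment Setup row can be made concrete with a short sketch. This is not the authors' code (the paper reports a MATLAB implementation that is not released); it is a minimal Python illustration under stated assumptions: the logarithm in θ is taken to be natural, each entry is assumed to be observed independently with probability p (which is consistent with E[|Ω|] = p · mn but is not confirmed by the text above), and the names theta, effective_sample_size and sample_omega are hypothetical.

```python
import math
import random


def theta(n, r):
    """Sampling level quoted in the setup row: theta = 2 * r * log^2(n) / n.

    Assumption: natural logarithm; the source does not state the base.
    """
    return 2.0 * r * math.log(n) ** 2 / n


def effective_sample_size(p, n, r):
    """Ratio between the sampling probability p and theta."""
    return p / theta(n, r)


def sample_omega(m, n, p, seed=0):
    """Draw an index set Omega for an m x n matrix.

    Assumption: each (i, j) is kept independently with probability p,
    so that E[|Omega|] = p * m * n. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    return [(i, j) for i in range(m) for j in range(n) if rng.random() < p]


if __name__ == "__main__":
    m, n, r = 200, 1000, 1
    p = theta(n, r)  # sample exactly at the quoted lower level
    omega = sample_omega(m, n, p)
    print("theta:", theta(n, r))
    print("effective sample size:", effective_sample_size(p, n, r))
    print("|Omega|:", len(omega), "expected about", p * m * n)
```

Sampling at p = θ gives an effective sample size of 1 by definition; the real-data setting p = 0.05 mentioned above would be passed to effective_sample_size directly, together with the n and r of that data set.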