Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling
Authors: Maria-Florina F. Balcan, Hongyang Zhang
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed algorithms perform well experimentally in both synthetic and real-world datasets. ... 4 Experimental Results Bounded Deterministic Noise: We verify the estimated error of our algorithm in Theorem 1 under bounded deterministic noise. Our synthetic data are generated as follows. ... Sparse Random Noise: We then verify the exact recoverability of our algorithm under sparse random noise. ... Mixture of Subspaces: To test the performance of our algorithm for the mixture of subspaces, we conduct an experiment on the Hopkins 155 dataset. |
| Researcher Affiliation | Academia | Maria-Florina Balcan Machine Learning Department Carnegie Mellon University, USA ninamf@cs.cmu.edu Hongyang Zhang Machine Learning Department Carnegie Mellon University, USA hongyanz@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1 Noise-Tolerant Life-Long Matrix Completion under Bounded Deterministic Noise |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | Yes | Mixture of Subspaces: To test the performance of our algorithm for the mixture of subspaces, we conduct an experiment on the Hopkins 155 dataset. The Hopkins 155 database is composed of 155 matrices/tasks... |
| Dataset Splits | No | The paper does not explicitly state training, validation, and test dataset splits with percentages or sample counts. It mentions "randomly pick 20% entries to be unobserved" but this refers to missing data within samples, not dataset splits. |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide any specific software dependencies or version numbers (e.g., programming languages, libraries, frameworks) used for the experiments. |
| Experiment Setup | Yes | Our synthetic data are generated as follows. We construct 5 base vectors $\{u_i\}_{i=1}^{5}$ by sampling their entries from $\mathcal{N}(0, 1)$. The underlying matrix $L$ is then generated by $L = \left[ u_1 \mathbf{1}_{200}^T, \sum_{i=1}^{2} u_i \mathbf{1}_{200}^T, \sum_{i=1}^{3} u_i \mathbf{1}_{200}^T, \sum_{i=1}^{4} u_i \mathbf{1}_{200}^T, \sum_{i=1}^{5} u_i \mathbf{1}_{1,200}^T \right] \in \mathbb{R}^{100 \times 2,000}$, each column of which is normalized to the unit ℓ2 norm. Finally, we add bounded yet unstructured noise to each column, with noise level $\epsilon_{\text{noise}} = 0.6$. We randomly pick 20% entries to be unobserved. ... The sparse random noise is drawn from standard Gaussian distribution such that $s_0 \le d - r - 1$. For each size of problem (50 × 500 and 100 × 1,000), we test with different rank ratios r/m and measurement ratios d/m. The experiment is run by 10 times. |
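
The synthetic setup quoted in the Experiment Setup row can be sketched in NumPy as below. This is a minimal illustration, not the authors' code (the paper releases none). The function name `make_synthetic`, the seed, and the two sampling choices marked in the comments are assumptions. The quoted text says only that the noise is "bounded yet unstructured" and that 20% of entries are picked at random. It does not fix a noise distribution or a masking scheme.

```python
import numpy as np


def make_synthetic(m=100, eps_noise=0.6, missing_frac=0.2, seed=0):
    """Hypothetical helper: build the 100 x 2,000 synthetic matrix described above."""
    rng = np.random.default_rng(seed)

    # Five base vectors u_1..u_5 (columns of U), entries drawn from N(0, 1).
    U = rng.standard_normal((m, 5))

    # Block k repeats the partial sum u_1 + ... + u_k across its columns.
    block_sizes = [200, 200, 200, 200, 1200]
    blocks = []
    for k, size in enumerate(block_sizes, start=1):
        partial_sum = U[:, :k].sum(axis=1, keepdims=True)
        blocks.append(np.repeat(partial_sum, size, axis=1))
    L = np.hstack(blocks)

    # Normalize every column to unit l2 norm.
    L = L / np.linalg.norm(L, axis=0, keepdims=True)

    # ASSUMPTION: noise is a random Gaussian direction rescaled so that each
    # column of the noise has l2 norm exactly eps_noise.
    E = rng.standard_normal(L.shape)
    E *= eps_noise / np.linalg.norm(E, axis=0, keepdims=True)
    M = L + E

    # ASSUMPTION: exactly missing_frac of all entries are hidden, chosen
    # uniformly at random without replacement. True means "observed".
    mask = np.ones(L.size, dtype=bool)
    hidden = rng.choice(L.size, size=round(missing_frac * L.size), replace=False)
    mask[hidden] = False
    mask = mask.reshape(L.shape)

    return L, M, mask


L, M, mask = make_synthetic()
```

Here `L` is the clean rank-5 matrix, `M` is its noisy version, and `mask` marks which entries of `M` an algorithm would be allowed to observe.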