Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling
Authors: Maria-Florina F. Balcan, Hongyang Zhang
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed algorithms perform well experimentally in both synthetic and real-world datasets. ... 4 Experimental Results Bounded Deterministic Noise: We verify the estimated error of our algorithm in Theorem 1 under bounded deterministic noise. Our synthetic data are generated as follows. ... Sparse Random Noise: We then verify the exact recoverability of our algorithm under sparse random noise. ... Mixture of Subspaces: To test the performance of our algorithm for the mixture of subspaces, we conduct an experiment on the Hopkins 155 dataset. |
| Researcher Affiliation | Academia | Maria-Florina Balcan Machine Learning Department Carnegie Mellon University, USA EMAIL Hongyang Zhang Machine Learning Department Carnegie Mellon University, USA EMAIL |
| Pseudocode | Yes | Algorithm 1 Noise-Tolerant Life-Long Matrix Completion under Bounded Deterministic Noise |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | Yes | Mixture of Subspaces: To test the performance of our algorithm for the mixture of subspaces, we conduct an experiment on the Hopkins 155 dataset. The Hopkins 155 database is composed of 155 matrices/tasks... |
| Dataset Splits | No | The paper does not explicitly state training, validation, and test dataset splits with percentages or sample counts. It mentions "randomly pick 20% entries to be unobserved" but this refers to missing data within samples, not dataset splits. |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide any specific software dependencies or version numbers (e.g., programming languages, libraries, frameworks) used for the experiments. |
| Experiment Setup | Yes | Our synthetic data are generated as follows. We construct 5 base vectors $\{u_i\}_{i=1}^{5}$ by sampling their entries from $\mathcal{N}(0, 1)$. The underlying matrix $L$ is then generated by $L = \left[ u_1 \mathbf{1}_{200}^T,\ \sum_{i=1}^{2} u_i \mathbf{1}_{200}^T,\ \sum_{i=1}^{3} u_i \mathbf{1}_{200}^T,\ \sum_{i=1}^{4} u_i \mathbf{1}_{200}^T,\ \sum_{i=1}^{5} u_i \mathbf{1}_{1,200}^T \right] \in \mathbb{R}^{100 \times 2,000}$, each column of which is normalized to the unit $\ell_2$ norm. Finally, we add bounded yet unstructured noise to each column, with noise level $\epsilon_{noise} = 0.6$. We randomly pick 20% entries to be unobserved. ... The sparse random noise is drawn from standard Gaussian distribution such that $s_0 \le d - r - 1$. For each size of problem ($50 \times 500$ and $100 \times 1,000$), we test with different rank ratios $r/m$ and measurement ratios $d/m$. The experiment is run by 10 times. |
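
The synthetic-data recipe quoted in the Experiment Setup row can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' code (the paper releases none). The function name `make_synthetic`, the seed handling, the way the noise is drawn (a random direction per column rescaled to norm $\epsilon_{noise}$), and the independent per-entry missingness are all assumptions, since the quoted text only says the noise is "bounded yet unstructured" and that 20% of entries are unobserved.

```python
import numpy as np


def make_synthetic(m=100, eps_noise=0.6, missing_frac=0.2, seed=0):
    """Hypothetical helper: build the 100 x 2,000 synthetic matrix from the quoted setup."""
    rng = np.random.default_rng(seed)

    # Five base vectors u_1..u_5 in R^m with i.i.d. N(0, 1) entries.
    U = rng.standard_normal((m, 5))

    # Column k-1 of partial_sums holds u_1 + ... + u_k.
    partial_sums = np.cumsum(U, axis=1)

    # Block k repeats the k-th partial sum; block widths as quoted (sum to 2,000).
    widths = [200, 200, 200, 200, 1200]
    L = np.hstack(
        [np.tile(partial_sums[:, [k]], (1, w)) for k, w in enumerate(widths)]
    )

    # Normalize every column to unit l2 norm.
    L /= np.linalg.norm(L, axis=0, keepdims=True)

    # ASSUMPTION: noise is a random direction per column with norm exactly eps_noise.
    E = rng.standard_normal(L.shape)
    E *= eps_noise / np.linalg.norm(E, axis=0, keepdims=True)
    M = L + E

    # ASSUMPTION: each entry is unobserved independently with probability missing_frac.
    observed = rng.random(L.shape) >= missing_frac
    return L, M, observed


if __name__ == "__main__":
    L, M, observed = make_synthetic()
    print(L.shape, np.linalg.matrix_rank(L), round(float(observed.mean()), 3))
```

Under this construction the clean matrix has rank at most 5, and its column space grows by one direction per block, which matches the life-long setting in which new subspace directions appear over time.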