Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Spectrum Estimation from a Few Entries
Authors: Ashish Khetan, Sewoong Oh
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments suggest that we significantly improve upon a competing approach of using matrix completion methods, below the matrix completion threshold, above which matrix completion algorithms recover the underlying low-rank matrix exactly. |
| Researcher Affiliation | Academia | Department of Industrial and Enterprise Systems Engineering University of Illinois Urbana-Champaign Urbana, IL 61801, USA |
| Pseudocode | Yes | Algorithm 1 Schatten k-norm estimator |
| Open Source Code | Yes | A MATLAB implementation of the estimator (3), that includes as its sub-routines the computation of the weights of all k-cyclic pseudographs, is available for download at https://github.com/khetan2/Schatten_norm_estimation. |
| Open Datasets | No | The paper describes generating synthetic data for its numerical experiments, not the use of pre-existing public datasets. For example, in the description of Figure 4, it states: "M is a symmetric positive semi-definite matrix of size d = 500, and rank r = 100 (left panel) and r = 500 (right panel). Singular vectors U of M = UΣU , are generated by QR decomposition of N(0, Id d) and Σi,i is uniformly distributed over [1, 2]." |
| Dataset Splits | No | The paper describes theoretical analysis and numerical experiments primarily involving synthetically generated matrices and a sampling process (Erdos-Renyi sampling of matrix entries). It does not specify traditional training/test/validation dataset splits typically found in empirical machine learning studies for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments. It focuses on theoretical computational complexity (e.g., O(dα)). |
| Software Dependencies | No | The paper mentions "A MATLAB implementation of the estimator (3)" on page 8, but it does not specify a version number for MATLAB or any other critical software libraries or dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | M is a symmetric positive semi-definite matrix of size d = 500, and rank r = 100 (left panel) and r = 500 (right panel). Singular vectors U of M = UΣU , are generated by QR decomposition of N(0, Id d) and Σi,i is uniformly distributed over [1, 2]. For a low rank matrix on the left, there is a clear critical value of p = 0.45, above which matrix completion is exact with high probability. We construct a symmetric matrix M of size d = 1000 and rank r = 200, σi Uni(0, 0.4) for 1 i r/2, and σi Uni(0.6, 1) for r/2 + 1 i r. We estimate br(PΩ(M); c1, c2) for Erd os-R enyi sampling Ω, and a choice of c2 = 0.5 and c1 = 0.6, which is motivated by the distribution of σi. We use Chebyshev polynomial of degree Cb = 2, and s = 1 for qs. |