Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula

Authors: Yuxuan Zhao, Madeleine Udell

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types, including those with high rank.
Researcher Affiliation | Academia | Yuxuan Zhao, Cornell University, EMAIL; Madeleine Udell, Cornell University, EMAIL
Pseudocode | Yes | Algorithm 1: Imputation via low rank Gaussian copula fitting
Open Source Code | No | The paper does not provide an explicit statement about releasing code for the methodology or a link to a code repository.
Open Datasets | Yes | MovieLens 1M dataset [20]
Dataset Splits | Yes | We use 80% of observation as training set, 10% as validation set, and 10% as test set, repeated 5 times.
Hardware Specification | Yes | On a laptop with Intel-i5-3.1GHz Core and 8 GB RAM
Software Dependencies | No | The paper mentions software like "R" and "julia" but does not provide specific version numbers for any software components.
Experiment Setup | Yes | We set n = 500 and p = 200. For continuous data, we use g_j(z) = z to generate a low rank X = Z and g_j(z) = z^3 to generate a high rank X. We set k = 10, σ^2 = 0.1 and the missing ratio as 40%. For 1-5 ordinal data and binary data, we use step functions g_j with random selected cut points. We generate one X with high SNR σ^2 = 0.1 and one X with low SNR σ^2 = 0.5. We set k = 5 and the missing ratio as 60%. All experiments are repeated 20 times. ... LRGC (rank 10) takes 38 mins in R, softImpute (rank 201) takes 93 mins in R, and GLRM-BvS (rank 200) takes 25 mins in julia.
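The continuous synthetic setting quoted in the Experiment Setup row, and the 80/10/10 split quoted in the Dataset Splits row, can be sketched roughly as follows. This is an illustrative sketch, not the authors' code (the paper releases none): the function names `make_continuous` and `split_observed` are hypothetical, and the way the latent matrix Z is built (a Gaussian factor product scaled by 1/sqrt(k), plus isotropic noise of variance σ^2) is an assumption, since the quoted text gives only n, p, k, σ^2, the transforms g_j, and the missing ratio. The paper applies the split to MovieLens 1M; applying it to the synthetic matrix here is only for demonstration.

```python
import numpy as np


def make_continuous(n=500, p=200, k=10, sigma2=0.1, high_rank=False,
                    missing=0.4, seed=0):
    """Return (X, X_obs): the complete matrix and a copy with hidden entries as NaN."""
    rng = np.random.default_rng(seed)
    # Assumption: latent Z is a rank-k Gaussian factor product plus isotropic noise.
    A = rng.standard_normal((n, k))
    B = rng.standard_normal((k, p))
    noise = rng.standard_normal((n, p))
    Z = A @ B / np.sqrt(k) + np.sqrt(sigma2) * noise
    # g_j(z) = z keeps X = Z; g_j(z) = z^3 gives the "high rank" variant.
    X = Z ** 3 if high_rank else Z
    # Hide a fixed fraction of entries, chosen uniformly at random.
    n_missing = int(round(missing * n * p))
    hidden = rng.permutation(n * p)[:n_missing]
    flat = X.copy().ravel()
    flat[hidden] = np.nan
    return X, flat.reshape(n, p)


def split_observed(X_obs, seed=0):
    """Split the flat indices of observed entries into 80% / 10% / 10%."""
    rng = np.random.default_rng(seed)
    observed = rng.permutation(np.flatnonzero(~np.isnan(X_obs)))
    n_train = int(round(0.8 * observed.size))
    n_val = int(round(0.1 * observed.size))
    return (observed[:n_train],
            observed[n_train:n_train + n_val],
            observed[n_train + n_val:])


if __name__ == "__main__":
    X, X_obs = make_continuous()            # low rank, sigma^2 = 0.1, 40% missing
    train, val, test = split_observed(X_obs)
    print(X_obs.shape, train.size, val.size, test.size)
```

Repeating the experiment, as the table describes, would amount to calling these functions with different `seed` values. The ordinal and binary settings (step functions g_j with random cut points, k = 5, 60% missing) are not covered by this sketch.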