Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

A Pairwise Pseudo-likelihood Approach for Matrix Completion with Informative Missingness

Authors: Jiangyuan Li, Jiayi Wang, Raymond K. W. Wong, Kwun Chuen Gary Chan

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The efficacy of our method is validated via numerical experiments, positioning it as a robust tool for matrix completion to mitigate data bias.
Researcher Affiliation	Academia	Jiangyuan Li*, Department of Statistics, Texas A&M University, College Station, TX 77843; Jiayi Wang*, Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080; Raymond K. W. Wong, Department of Statistics, Texas A&M University, College Station, TX 77843; Kwun Chuen Gary Chan, Department of Biostatistics, University of Washington, Seattle, WA 98195
Pseudocode	Yes	Algorithm 1 (Projected gradient descent). Initialize: initialize A(0) randomly, set learning rate η. for t = 0 to T do: K(t) = A(t) − η∇ℓ(A(t)); Q(t) = S_λ(K(t)); A(t+1) = POCS(Q(t)); end for. Algorithm 2 (POCS). Initialize: input matrix Q ∈ R^{m1×m2}, t = 0. while Q̃ ≠ Q do: Q̃ = Q; Q̄ = Q − (1/(m1 m2)) Σ_{i=1}^{m1} Σ_{j=1}^{m2} Q_{i,j} J; Q_{i,j} = 1{|Q̄_{i,j}| ≤ α} Q̄_{i,j} + 1{|Q̄_{i,j}| > α} sign(Q̄_{i,j}) α; end while.
Open Source Code	Yes	The code is publicly available on GitHub: https://github.com/jiangyuan-li/mc-w-pseudolikelihood
Open Datasets	Yes	Tobacco Dataset. This dataset is available in Table 11 in [9]... Coat Shopping Dataset. This dataset is available at https://www.cs.cornell.edu/~schnabts/mnar... Yahoo! Webscope Dataset. This dataset is available at https://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=3...
Dataset Splits	Yes	We use the observed entries as training data and equally split the unobserved data as validation and test data. The validation data is used for hyper-parameter tuning in each method.
Hardware Specification	No	The paper mentions "advanced computing resources provided by Texas A&M High Performance Research Computing" in the acknowledgements, but does not specify any particular GPU models, CPU types, or other detailed hardware specifications used for the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers for replication.
Experiment Setup	Yes	Since the objective function is convex in the proposed method, we only tune the regularization parameter λ, and fix the number of iterations as T = 100 and step size η = 1.0 in Algorithm 1.
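
The two algorithms quoted in the Pseudocode row can be sketched in NumPy as follows. This is an illustrative reading of the quoted pseudocode, not the authors' released implementation. Several pieces are assumptions: S_λ is taken to be singular-value soft-thresholding, J is taken to be the all-ones matrix (so the centering step subtracts the grand mean), and the gradient of the loss ℓ is passed in as a hypothetical callable `grad_loss`, since the pairwise pseudo-likelihood itself is not reproduced on this page. The function names, the stopping tolerance, and the iteration cap in `pocs` are likewise illustrative. The defaults T = 100 and η = 1.0 follow the Experiment Setup row.

```python
import numpy as np


def soft_threshold_singular_values(K, lam):
    """Assumed form of S_lambda: shrink each singular value of K by lam."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (U * s_shrunk) @ Vt  # scales column k of U by s_shrunk[k]


def pocs(Q, alpha, max_iter=1000, tol=1e-10):
    """Alternate two projections until the iterate stops changing.

    Step 1 subtracts the grand mean (assumes J is the all-ones matrix).
    Step 2 clips every entry to the interval [-alpha, alpha].
    """
    Q = np.asarray(Q, dtype=float).copy()
    for _ in range(max_iter):
        Q_prev = Q
        Q_centered = Q_prev - Q_prev.mean()
        Q = np.clip(Q_centered, -alpha, alpha)
        if np.max(np.abs(Q - Q_prev)) <= tol:
            break
    return Q


def projected_gradient_descent(grad_loss, shape, lam, alpha,
                               T=100, eta=1.0, seed=0):
    """Gradient step, then S_lambda, then POCS, for t = 0, ..., T."""
    rng = np.random.default_rng(seed)
    A = 0.01 * rng.standard_normal(shape)  # random initialization A(0)
    for _ in range(T + 1):
        K = A - eta * grad_loss(A)
        Q = soft_threshold_singular_values(K, lam)
        A = pocs(Q, alpha)
    return A


if __name__ == "__main__":
    # Stand-in quadratic loss 0.5 * ||A - M||_F^2, used only to exercise the
    # loop; it is NOT the paper's pairwise pseudo-likelihood.
    M = np.array([[3.0, -1.0], [0.5, -0.5]])
    A_hat = projected_gradient_descent(lambda A: A - M, M.shape,
                                       lam=0.1, alpha=1.0, T=5)
    print(A_hat)
```

In this sketch the regularization parameter `lam` is the only quantity one would tune on the validation split, matching the setup described in the table; `alpha` bounds the magnitude of every entry of the returned matrix.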