Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Pairwise Pseudo-likelihood Approach for Matrix Completion with Informative Missingness
Authors: Jiangyuan Li, Jiayi Wang, Raymond K. W. Wong, Kwun Chuen Gary Chan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The efficacy of our method is validated via numerical experiments, positioning it as a robust tool for matrix completion to mitigate data bias. |
| Researcher Affiliation | Academia | Jiangyuan Li*, Department of Statistics, Texas A&M University, College Station, TX 77843; Jiayi Wang*, Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080; Raymond K. W. Wong, Department of Statistics, Texas A&M University, College Station, TX 77843; Kwun Chuen Gary Chan, Department of Biostatistics, University of Washington, Seattle, WA 98195 |
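| Pseudocode | Yes | Algorithm 1 (Projected gradient descent). Initialize: initialize $A^{(0)}$ randomly, set learning rate $\eta$. For $t = 0$ to $T$ do: $K^{(t)} = A^{(t)} - \eta \nabla \ell(A^{(t)})$; $Q^{(t)} = S_\lambda(K^{(t)})$; $A^{(t+1)} = \mathrm{POCS}(Q^{(t)})$; end for. Algorithm 2 (POCS). Initialize: input matrix $Q \in \mathbb{R}^{m_1 \times m_2}$, $t = 0$. While $Q \neq Q'$ do: $Q' = Q$; $Q = Q' - \frac{1}{m_1 m_2} \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} Q'_{i,j} J$; $Q_{i,j} = \mathbb{1}\{\lvert Q_{i,j} \rvert \le \alpha\} Q_{i,j} + \mathbb{1}\{\lvert Q_{i,j} \rvert > \alpha\} \operatorname{sign}(Q_{i,j}) \alpha$; end while. |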
| Open Source Code | Yes | The code is publicly available on GitHub: https://github.com/jiangyuan-li/mc-w-pseudolikelihood |
| Open Datasets | Yes | Tobacco Dataset. This dataset is available in Table 11 in [9]... Coat Shopping Dataset. This dataset is available at https://www.cs.cornell.edu/~schnabts/mnar... Yahoo! Webscope Dataset. This dataset is available at https://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=3... |
| Dataset Splits | Yes | We use the observed entries as training data and equally split the unobserved data as validation and test data. The validation data is used for hyper-parameter tuning in each method. |
| Hardware Specification | No | The paper mentions "advanced computing resources provided by Texas A&M High Performance Research Computing" in the acknowledgements, but does not specify any particular GPU models, CPU types, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for replication. |
| Experiment Setup | Yes | Since the objective function is convex in the proposed method, we only tune the regularization parameter λ, and fix the number of iterations as T = 100 and step size η = 1.0 in Algorithm 1. |
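
The pseudocode quoted in the table can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that). Assumptions: `S_λ` is taken to be singular-value soft-thresholding, which the quoted text does not confirm; the gradient of the loss is passed in by the caller as the hypothetical argument `grad_loss`, because the pairwise pseudo-likelihood itself is not reproduced on this page; and the names `tol`, `max_iter`, and `seed`, as well as the scale of the random initialization, are illustrative additions.

```python
import numpy as np


def pocs(Q, alpha, tol=1e-10, max_iter=1000):
    """Alternate centering and clipping until Q stops changing (Algorithm 2).

    tol and max_iter are assumptions added to guarantee termination in
    floating point; the quoted pseudocode loops until Q is unchanged.
    """
    Q = np.asarray(Q, dtype=float).copy()
    for _ in range(max_iter):
        Q_prev = Q.copy()
        # Subtract the grand mean from every entry (mean times all-ones matrix J).
        Q = Q - Q.mean()
        # Keep entries with |Q_ij| <= alpha; cap the others at sign(Q_ij) * alpha.
        Q = np.clip(Q, -alpha, alpha)
        if np.max(np.abs(Q - Q_prev)) <= tol:
            break
    return Q


def soft_threshold_singular_values(K, lam):
    """Assumed form of S_lambda: shrink each singular value by lam, floored at 0."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt


def projected_gradient_descent(grad_loss, shape, lam, alpha, eta=1.0, T=100, seed=0):
    """Gradient step, S_lambda step, then POCS, repeated for T steps (Algorithm 1).

    grad_loss is a caller-supplied function A -> gradient of the loss at A.
    """
    rng = np.random.default_rng(seed)
    A = 0.01 * rng.standard_normal(shape)  # hypothetical random initialization
    for _ in range(T):
        K = A - eta * grad_loss(A)
        Q = soft_threshold_singular_values(K, lam)
        A = pocs(Q, alpha)
    return A


# Usage with a stand-in quadratic loss 0.5 * ||A - M||_F^2 (NOT the paper's loss).
M = np.array([[2.0, -1.0, 0.5], [0.0, 1.5, -3.0]])
A_hat = projected_gradient_descent(lambda A: A - M, M.shape, lam=0.1, alpha=1.0)
```

The defaults `eta=1.0` and `T=100` mirror the values quoted in the Experiment Setup row; `lam` is left to the caller because the quoted text says it is the only tuned hyper-parameter.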