Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Link Prediction via Subgraph Embedding-Based Convex Matrix Completion
Authors: Zhu Cao, Linlin Wang, Gerard de Melo
AAAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several datasets show the effectiveness of our method compared to previous work. We extensively evaluate our algorithm across a range of heterogeneous real-world datasets, and also demonstrate its scalability on large networks of up to a million nodes. The experiments show that our methods yield state-of-the-art link prediction results on all evaluated datasets. |
| Researcher Affiliation | Academia | IIIS, Tsinghua University, Beijing, China Rutgers University, New Brunswick, NJ, USA |
| Pseudocode | Yes | Algorithm 1 Overall Algorithm, Algorithm 2 V, C=Vocabulary(G,D), Algorithm 3 PPMI Matrix M= Rep(V, C), Algorithm 4 W=SOFT-IMPUTE(M), Algorithm 6 Rep2Score(M, u, v) |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | We consider the following datasets: Facebook (McAuley and Leskovec 2012; Leskovec and Krevl 2014): We use a Facebook social network dataset... Wikipedia (Leskovec and Krevl 2014): This real-world dataset is collected from Wikipedia... Coauthorship (Leskovec and Krevl 2014): This real-world dataset is formed from the coauthor network of general relativity section on arXiv. PPI (Breitkreutz et al. 2008): This protein-protein interaction network... Leskovec, J., and Krevl, A. 2014. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data. |
| Dataset Splits | Yes | For each dataset, the observed edges E are split into two parts ET and EP, where ET is used for training and EP for testing. The splitting is performed with 5-fold cross-validation. That is, the observed edges are split to five equal parts. Then we repeat 5 times, each time take one part as the test set and the rest four parts as the training set. |
| Hardware Specification | Yes | These experiments are run on a laptop with 2.8 GHz CPU and 8G memory. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For these experiments, the learning rate η and the ratio of regularization λ are optimized to be 0.3 and 0.001, respectively, according to cross validation within the training set. The depth D is taken to be 3 for the datasets FOrig, Wiki, Coauth, and 2 for the dataset PPI. For these tests, we fix the number of inner iterations to an appropriate value (i = 100), which ensures convergence. |
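
The Dataset Splits row describes a 5-fold cross-validation over the observed edges. Since the paper provides no source code, the following is only a minimal sketch of how such a split could be reproduced. The function name `five_fold_edge_splits`, the random shuffle, the `seed` parameter, and the toy edge list are all assumptions made for illustration; the quoted text does not say whether or how the edges are shuffled before splitting.

```python
import random


def five_fold_edge_splits(edges, k=5, seed=0):
    """Yield (train_edges, test_edges) pairs for k-fold CV over observed edges.

    Hypothetical helper: the shuffle and the seed are assumptions, not
    details taken from the paper.
    """
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled edges into k folds of (nearly) equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]  # plays the role of EP
        # The remaining k - 1 folds together play the role of ET.
        train = [e for j, fold in enumerate(folds) if j != i for e in fold]
        yield train, test


# Toy path graph with 10 edges, used only to exercise the helper.
edges = [(u, u + 1) for u in range(10)]
splits = list(five_fold_edge_splits(edges))
```

Each edge appears in the test set of exactly one fold, so every observed edge is held out once across the five repetitions.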