Link Prediction via Subgraph Embedding-Based Convex Matrix Completion

Authors: Zhu Cao, Linlin Wang, Gerard de Melo

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on several datasets show the effectiveness of our method compared to previous work. We extensively evaluate our algorithm across a range of heterogeneous real-world datasets, and also demonstrate its scalability on large networks of up to a million nodes. The experiments show that our methods yield state-of-the-art link prediction results on all evaluated datasets.
Researcher Affiliation | Academia | IIIS, Tsinghua University, Beijing, China; Rutgers University, New Brunswick, NJ, USA
Pseudocode | Yes | Algorithm 1 Overall Algorithm; Algorithm 2 V, C = Vocabulary(G, D); Algorithm 3 PPMI Matrix M = Rep(V, C); Algorithm 4 W = SOFT-IMPUTE(M); Algorithm 6 Rep2Score(M, u, v)
Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | We consider the following datasets: Facebook (McAuley and Leskovec 2012; Leskovec and Krevl 2014): We use a Facebook social network dataset... Wikipedia (Leskovec and Krevl 2014): This real-world dataset is collected from Wikipedia... Coauthorship (Leskovec and Krevl 2014): This real-world dataset is formed from the coauthor network of the general relativity section on arXiv. PPI (Breitkreutz et al. 2008): This protein-protein interaction network... Leskovec, J., and Krevl, A. 2014. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data
Dataset Splits | Yes | For each dataset, the observed edges E are split into two parts, ET and EP, where ET is used for training and EP for testing. The splitting is performed with 5-fold cross-validation: the observed edges are split into five equal parts, and we repeat five times, each time taking one part as the test set and the remaining four parts as the training set.
Hardware Specification | Yes | These experiments are run on a laptop with a 2.8 GHz CPU and 8 GB of memory.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | For these experiments, the learning rate η and the regularization ratio λ are optimized to 0.3 and 0.001, respectively, via cross-validation within the training set. The depth D is set to 3 for the FOrig, Wiki, and Coauth datasets, and to 2 for the PPI dataset. For these tests, we fix the number of inner iterations to an appropriate value (i = 100), which ensures convergence.
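The table references the paper's SOFT-IMPUTE routine (Algorithm 4) and its 5-fold edge split. As a rough illustration only, the sketch below implements the generic SOFT-IMPUTE iteration of Mazumder, Hastie, and Tibshirani (2010), i.e. repeated SVD with soft-thresholded singular values on a partially observed matrix, together with a 5-fold edge-splitting helper matching the protocol quoted above. The function names, the `mask` convention, the zero initialization, and the convergence tolerance are assumptions of this sketch, not the paper's exact pseudocode.

```python
import random
import numpy as np

def soft_impute(M, mask, lam=0.001, n_iters=100, tol=1e-6):
    """Generic SOFT-IMPUTE iteration: fill the unobserved entries of M
    with the current estimate W, take an SVD of the filled matrix, and
    soft-threshold its singular values by lam."""
    W = np.zeros_like(M, dtype=float)
    for _ in range(n_iters):
        filled = np.where(mask, M, W)          # keep observed entries fixed
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s_shrunk = np.maximum(s - lam, 0.0)    # soft-threshold the spectrum
        W_new = (U * s_shrunk) @ Vt
        # Stop once the iterate changes negligibly (Frobenius norm).
        if np.linalg.norm(W_new - W) <= tol * max(np.linalg.norm(W), 1.0):
            return W_new
        W = W_new
    return W

def five_fold_edge_splits(edges, seed=0):
    """Shuffle the observed edges into five equal folds and yield
    (train, test) pairs, each fold serving once as the test set E_P
    with the remaining four folds as the training set E_T."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    folds = [edges[i::5] for i in range(5)]
    for i in range(5):
        test = folds[i]
        train = [e for j, fold in enumerate(folds) if j != i for e in fold]
        yield train, test
```

The reported hyperparameters (λ = 0.001, i = 100 inner iterations) slot in as `lam` and `n_iters` here, though how the paper's Algorithm 4 initializes and terminates is not stated in the quotes above.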