Revisiting Link Prediction: A Data Perspective

Authors: Haitao Mao, Juanhui Li, Harry Shomer, Bingheng Li, Wenqi Fan, Yao Ma, Tong Zhao, Neil Shah, Jiliang Tang

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct analysis on CORA, CITESEER, PUBMED, OGBL-COLLAB, OGBL-PPA, and OGBL-DDI datasets (Hu et al., 2020; McCallum et al., 2000) with the same model setting as recent benchmark (Li et al., 2023). Experimental and dataset details are in Appendix K and J, respectively.
Researcher Affiliation | Collaboration | Haitao Mao (1), Juanhui Li (1), Harry Shomer (1), Bingheng Li (2), Wenqi Fan (3), Yao Ma (4), Tong Zhao (5), Neil Shah (5), and Jiliang Tang (1); (1) Michigan State University, (2) The Chinese University of Hong Kong, Shenzhen, (3) Hong Kong Polytechnic University, (4) Rensselaer Polytechnic Institute, (5) Snap Inc.
Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code or an algorithm.
Open Source Code | Yes | Code is available at here.
Open Datasets | Yes | We conduct analysis on CORA, CITESEER, PUBMED, OGBL-COLLAB, OGBL-PPA, and OGBL-DDI datasets (Hu et al., 2020; McCallum et al., 2000) with the same model setting as recent benchmark (Li et al., 2023).
Dataset Splits | Yes | For the data split, we adapt the fixed split with percentages 85/5/10% for Planetoid datasets, which can be found at https://github.com/Juanhui28/HeaRT. For OGB datasets, we use the fixed splits provided by the OGB benchmark (Hu et al., 2020).
Hardware Specification | Yes | The experiments are performed on one Linux server (CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @2.60GHz, Operating system: Ubuntu 16.04.6 LTS). For GPU resources, eight NVIDIA Tesla V100 cards are utilized.
Software Dependencies | Yes | The Python libraries we use to implement our experiments are PyTorch 1.12.1 and PyG 2.1.0.post1.
Experiment Setup | Yes | Training Settings. The binary cross entropy loss and Adam optimizer (Kingma & Ba, 2014) are utilized for training. For each training positive sample, we randomly select one negative sample for training. Each model is trained with a maximum of 9999 epochs with the early stop training strategy. We set the early stop epoch to 50 and 20 for planetoid and OGB datasets, respectively. Hyperparameter Settings. For deep models, the hyperparameter searching range is shown in Table 10.
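
The Open Datasets and Dataset Splits rows above only quote the split percentages. The sketch below is a minimal illustration (not the authors' code) of how such data is typically loaded with PyG and the ogb package: the paper uses the fixed split files from https://github.com/Juanhui28/HeaRT for the Planetoid datasets, so the RandomLinkSplit call here only reproduces the 85/5/10% percentages, not the exact fixed edges, while the OGB splits are the benchmark's own.

# Sketch: loading the datasets and splits described above (illustrative only).
import torch_geometric.transforms as T
from torch_geometric.datasets import Planetoid
from ogb.linkproppred import PygLinkPropPredDataset

# Planetoid datasets (Cora / CiteSeer / PubMed) with an 85/5/10% link split.
# The paper uses the fixed HeaRT split files; this random split only matches the ratios.
dataset = Planetoid(root="data/Planetoid", name="Cora")
splitter = T.RandomLinkSplit(
    num_val=0.05,                      # 5% validation edges
    num_test=0.10,                     # 10% test edges
    is_undirected=True,
    add_negative_train_samples=False,  # negatives are sampled during training
)
train_data, val_data, test_data = splitter(dataset[0])

# OGB datasets use the fixed splits provided by the OGB benchmark itself.
ogb_dataset = PygLinkPropPredDataset(name="ogbl-collab", root="data/OGB")
split_edge = ogb_dataset.get_edge_split()  # dict with 'train' / 'valid' / 'test' edges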
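
The Experiment Setup row describes the training recipe in prose only. The sketch below is a hedged rendering of that recipe under stated assumptions: binary cross entropy loss, Adam, one randomly sampled negative per positive edge, and early stopping with a patience of 50 (Planetoid) or 20 (OGB) epochs. The GCN encoder, dot-product scorer, and the simple validation score are illustrative placeholders, not the models or ranking metrics (e.g., Hits@K) actually benchmarked in the paper.

# Sketch of the quoted training settings (placeholder model, not the paper's).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.utils import negative_sampling


class Encoder(torch.nn.Module):
    """Placeholder two-layer GCN encoder; the paper benchmarks several models."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, hid_dim)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)


def train_with_early_stopping(data, pos_train_edge, pos_val_edge,
                              patience=50, max_epochs=9999):
    # patience: 50 for Planetoid, 20 for OGB, as quoted above.
    model = Encoder(data.num_features, 256)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    best_val, wait = float("-inf"), 0

    for epoch in range(max_epochs):
        model.train()
        optimizer.zero_grad()
        h = model(data.x, data.edge_index)

        # One randomly sampled negative edge per positive training edge.
        neg_edge = negative_sampling(data.edge_index, num_nodes=data.num_nodes,
                                     num_neg_samples=pos_train_edge.size(1))
        pos_score = (h[pos_train_edge[0]] * h[pos_train_edge[1]]).sum(dim=-1)
        neg_score = (h[neg_edge[0]] * h[neg_edge[1]]).sum(dim=-1)

        # Binary cross entropy over positive and negative pairs.
        loss = F.binary_cross_entropy_with_logits(
            torch.cat([pos_score, neg_score]),
            torch.cat([torch.ones_like(pos_score), torch.zeros_like(neg_score)]),
        )
        loss.backward()
        optimizer.step()

        # Early stopping: mean positive validation score stands in for the
        # ranking metric (e.g., Hits@K) the paper would actually monitor.
        model.eval()
        with torch.no_grad():
            h = model(data.x, data.edge_index)
            val_score = (h[pos_val_edge[0]] * h[pos_val_edge[1]]).sum(dim=-1).mean().item()
        if val_score > best_val:
            best_val, wait = val_score, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return model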