Neural Link Prediction with Walk Pooling

Authors: Liming Pan, Cheng Shi, Ivan Dokmanić

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | It outperforms state-of-the-art methods on all common link prediction benchmarks, both homophilic and heterophilic, with and without node attributes. Applying WalkPool to a set of unsupervised GNNs significantly improves prediction accuracy, suggesting that it may be used as a general-purpose graph pooling scheme. We experiment with eight datasets without node attributes and seven with attributes. We perform 10 random splits of the data. The average AUCs with standard deviations are shown in Table 2. (An evaluation-loop sketch follows this table.)
Researcher Affiliation | Academia | Liming Pan, School of Computer and Electronic Information, Nanjing Normal University, 210023 Nanjing, China, pan.liming@njnu.edu.cn; Cheng Shi* & Ivan Dokmanić, Departement Mathematik und Informatik, Universität Basel, 4051 Basel, Switzerland, {firstname.lastname}@unibas.ch
Pseudocode | No | No formal pseudocode or algorithm blocks (e.g., Algorithm 1) are present; only descriptive text.
Open Source Code | Yes | Our code and data are available online at https://github.com/DaDaCheng/WalkPooling.
Open Datasets | Yes | We experiment with eight datasets without node attributes and seven with attributes. As graphs without attributes we use: (i) USAir (Batagelj & Mrvar, 2006), (ii) NS (Newman, 2006), (iii) PB (Ackland et al., 2005), (iv) Yeast (Von Mering et al., 2002), (v) C.ele (Watts & Strogatz, 1998), (vi) Power (Watts & Strogatz, 1998), (vii) Router (Spring et al., 2002), and (viii) E.coli (Zhang et al., 2018). As graphs with node attributes, we use: (i) Cora (McCallum et al., 2000), (ii) Citeseer (Giles et al., 1998), (iii) Pubmed (Namata et al., 2012), (iv) Chameleon (Rozemberczki et al., 2021), (v) Cornell (Craven et al., 1998), (vi) Texas (Craven et al., 1998), and (vii) Wisconsin (Craven et al., 1998). (A loading sketch follows this table.)
Dataset Splits | Yes | Following the experimental protocols in (Kipf & Welling, 2016b; Pan et al., 2018; Mavromatis & Karypis, 2020), we split the links in three parts: 10% testing, 5% validation, 85% training. For the datasets without node attributes, 90% of edges are taken as training positive edges and the remaining 10% as testing positive edges. The same number of negative edges is sampled randomly for training and testing. Then, among the training edges, we randomly selected 5% for validation. (A split sketch follows this table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, cloud instances) are provided for the experimental setup.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8) are mentioned.
Experiment Setup | Yes | The hyperparameters to reproduce the results are summarized in Table 6, with values given as "with attributes / no attributes" (a configuration sketch follows this table):
optimizer: Adam / Adam
learning rate: 5e-5 / 5e-5
weight decay: 0 / 0
test ratio: 0.1 / 0.1
validation ratio: 0.05 of training edges / 0.05 of all edges
batch size: 32 / 32
epochs: 50 / 50
hops of enclosing subgraph: 2 / 2
dimension of initial representation Z(0): 16 / 32
initial representation Z(0): ones or DL / unsupervised models
hidden layers of GCN: 32 / 32
output layers of GCN: 32 / 32
hidden layers of attention MLP: 32 / 32
output layers of attention MLP: 32 / 32
walk length cutoff τc: 7 / 7
heads: 2 / 2
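
All seven attributed benchmarks are standard public datasets. As one way to fetch them, the sketch below uses PyTorch Geometric's built-in loaders; this loader choice is our assumption, and the authors' repository may instead ship its own copies of the data.

    # One way to obtain the seven attributed benchmarks (assumption: PyTorch
    # Geometric is installed; the authors' repo may package the data itself).
    from torch_geometric.datasets import Planetoid, WikipediaNetwork, WebKB

    cora      = Planetoid(root="data", name="Cora")
    citeseer  = Planetoid(root="data", name="CiteSeer")
    pubmed    = Planetoid(root="data", name="PubMed")
    chameleon = WikipediaNetwork(root="data", name="chameleon")
    cornell   = WebKB(root="data", name="Cornell")
    texas     = WebKB(root="data", name="Texas")
    wisconsin = WebKB(root="data", name="Wisconsin")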
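
The 85/5/10 split with matched negative sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name split_edges is ours, and the paper's convention of drawing the 5% validation set from the training edges (for the non-attributed datasets) is noted in a comment rather than reproduced exactly.

    # Sketch of the link split described above, assuming an undirected graph
    # given by a PyTorch Geometric edge_index. All names are illustrative.
    import torch
    from torch_geometric.utils import negative_sampling

    def split_edges(edge_index, num_nodes, test_ratio=0.10, val_ratio=0.05, seed=0):
        torch.manual_seed(seed)
        mask = edge_index[0] < edge_index[1]   # keep one direction per undirected edge
        edges = edge_index[:, mask]
        edges = edges[:, torch.randperm(edges.size(1))]

        n_test = int(test_ratio * edges.size(1))
        n_val = int(val_ratio * edges.size(1))  # for non-attributed datasets the
                                                # paper draws this from training edges
        test_pos = edges[:, :n_test]
        val_pos = edges[:, n_test:n_test + n_val]
        train_pos = edges[:, n_test + n_val:]

        # Sample the same number of negative (non-)edges as positives.
        neg = negative_sampling(edge_index, num_nodes=num_nodes,
                                num_neg_samples=edges.size(1))
        test_neg = neg[:, :n_test]
        val_neg = neg[:, n_test:n_test + n_val]
        train_neg = neg[:, n_test + n_val:]
        return (train_pos, train_neg), (val_pos, val_neg), (test_pos, test_neg)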
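
The reported numbers are average AUCs with standard deviations over 10 random splits. A minimal evaluation loop under that protocol could look like the following; train_and_predict is a hypothetical placeholder for training WalkPool on one split and scoring the held-out test links.

    # Average AUC over 10 random splits, as reported in Table 2 of the paper.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    aucs = []
    for seed in range(10):
        # Hypothetical helper: trains on one random split and returns test
        # labels (1 = real edge, 0 = sampled non-edge) and predicted scores.
        y_true, y_score = train_and_predict(seed)
        aucs.append(roc_auc_score(y_true, y_score))
    print(f"AUC: {100 * np.mean(aucs):.2f} +/- {100 * np.std(aucs):.2f}")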
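
For reference, the Table 6 settings can be collected into a plain configuration dictionary. This is a sketch; the key names are ours, not those used in the authors' repository.

    # Hyperparameters from Table 6, keyed by dataset family. Key names are
    # illustrative, not the repository's.
    common = dict(
        optimizer="Adam", lr=5e-5, weight_decay=0.0,
        test_ratio=0.1, batch_size=32, epochs=50,
        num_hops=2,               # hops of the enclosing subgraph
        gcn_hidden=32, gcn_out=32,
        attn_mlp_hidden=32, attn_mlp_out=32,
        walk_length_cutoff=7,     # tau_c
        heads=2,
    )
    config = {
        "with_attributes": dict(common, z0_dim=16, val_ratio="0.05 of training edges"),
        "no_attributes":   dict(common, z0_dim=32, val_ratio="0.05 of all edges"),
    }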