Adaptive Universal Generalized PageRank Graph Neural Network

Authors: Eli Chien, Jianhao Peng, Pan Li, Olgica Milenkovic

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also compare the performance of our GNN architecture with that of several state-of-the-art GNNs on the problem of node classification, using well-known benchmark homophilic and heterophilic datasets. The results demonstrate that GPR-GNN offers significant performance improvement compared to existing techniques on both synthetic and benchmark data.
Researcher Affiliation | Academia | Eli Chien and Jianhao Peng, Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, USA ({ichien3,jianhao2}@illinois.edu); Pan Li, Department of Computer Science, Purdue University, USA (panli@purdue.edu); Olgica Milenkovic, Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, USA (milenkov@illinois.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. (A hedged sketch of the GPR-GNN propagation rule is given after this table.)
Open Source Code | Yes | Our implementation is available online at https://github.com/jianhao2016/GPRGNN
Open Datasets | Yes | We use 5 homophilic benchmark datasets available from the PyTorch Geometric library, including the citation graphs Cora, CiteSeer, PubMed (Sen et al., 2008; Yang et al., 2016) and the Amazon co-purchase graphs Computers and Photo (McAuley et al., 2015; Shchur et al., 2018). We also use 5 heterophilic benchmark datasets tested in Pei et al. (2019), including the Wikipedia graphs Chameleon and Squirrel, the Actor co-occurrence graph, and the webpage graphs Texas and Cornell from WebKB. (A dataset-loading sketch is given after this table.)
Dataset Splits | Yes | We consider two different choices for the random split into training/validation/test samples, which we call sparse splitting (2.5%/2.5%/95%) and dense splitting (60%/20%/20%), respectively. (A split-generation sketch is given after this table.)
Hardware Specification | Yes | All experiments are performed on a Linux machine with 48 cores, 376GB of RAM, and an NVIDIA Tesla P100 GPU with 12GB of GPU memory.
Software Dependencies | No | For all architectures, we use the corresponding PyTorch Geometric library implementations (Fey & Lenssen, 2019). ... All models use the Adam optimizer (Kingma & Ba, 2014). Specific version numbers for software components such as Python or PyTorch are not provided.
Experiment Setup | Yes | We choose random walk path lengths with K = 10 and use a 2-layer MLP with 64 hidden units for the NN component. For the GPR weights, we use different initializations, including PPR with α ∈ {0.1, 0.2, 0.5, 0.9}, γ_k = δ_{0k} or γ_k = δ_{Kk}, and the default random initialization in PyTorch. Similarly, for APPNP we search for the optimal α within {0.1, 0.2, 0.5, 0.9}. For other hyperparameter tuning, we optimize the learning rate over {0.002, 0.01, 0.05} and the weight decay over {0.0, 0.0005} for all models. ... We use early stopping with patience 200 and a maximum of 1000 epochs. ... For GCN, we use 2 GCN layers with 64 hidden units. For GAT, we use 2 GAT convolutional layers, where the first layer has 8 attention heads with 8 hidden units each and the second layer has 1 attention head and 64 hidden units. For GCN-Cheby, we use 2-step propagation for each layer with 32 hidden units. ... For JK-Net, we use the GCN-based model with 2 layers and 16 hidden units in each layer; for the layer aggregation, we use an LSTM with 16 channels and 4 layers. For the MLP, we choose a 2-layer fully connected network with 64 hidden units. For APPNP we use the same 2-layer MLP with 10 steps of propagation. For GPR-GNN, we fix the dropout rate for the NN part to 0.5 (as in APPNP) and optimize the dropout rate for the GPR part over {0, 0.5, 0.7}. (A tuning-grid sketch is given after this table.)
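
Since the paper contains no pseudocode, the following is a minimal sketch of the GPR-GNN computation it describes: a 2-layer MLP f_θ applied to the node features, followed by K = 10 propagation steps whose outputs are combined with learnable GPR weights γ_0, ..., γ_K (here initialized in PPR fashion). The class name, the dense normalized adjacency, and the dropout placement are illustrative assumptions, not the authors' implementation (which is available at the repository linked above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GPRGNNSketch(nn.Module):
    """Sketch of GPR-GNN: a 2-layer MLP followed by K steps of generalized
    PageRank propagation with learnable weights gamma_0, ..., gamma_K."""

    def __init__(self, in_dim, hidden_dim, num_classes, K=10, alpha=0.1, dropout=0.5):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, num_classes)
        # PPR initialization: gamma_k = alpha (1 - alpha)^k, gamma_K = (1 - alpha)^K
        gamma = alpha * (1 - alpha) ** torch.arange(K + 1, dtype=torch.float)
        gamma[-1] = (1 - alpha) ** K
        self.gamma = nn.Parameter(gamma)  # learned jointly with the MLP
        self.K = K
        self.dropout = dropout

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency with self-loops,
        # D^{-1/2} (A + I) D^{-1/2}, assumed dense here for simplicity
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = F.relu(self.lin1(x))
        x = F.dropout(x, p=self.dropout, training=self.training)
        h = self.lin2(x)                  # H^(0) = f_theta(X)
        z = self.gamma[0] * h
        for k in range(1, self.K + 1):
            h = adj_norm @ h              # H^(k) = A_hat @ H^(k-1)
            z = z + self.gamma[k] * h     # Z = sum_k gamma_k H^(k)
        return F.log_softmax(z, dim=1)
```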
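
The ten benchmark graphs named in the Open Datasets row can all be obtained through the PyTorch Geometric datasets module; the sketch below shows one way to load them. The class names follow a recent PyTorch Geometric release, and the `root` directory is an arbitrary choice; the authors' own data-loading code may differ.

```python
from torch_geometric.datasets import Planetoid, Amazon, WikipediaNetwork, Actor, WebKB

root = "data"  # local cache directory (arbitrary choice)

# 5 homophilic benchmarks: citation graphs and Amazon co-purchase graphs
homophilic = [Planetoid(root, name=n) for n in ("Cora", "CiteSeer", "PubMed")]
homophilic += [Amazon(root, name=n) for n in ("Computers", "Photo")]

# 5 heterophilic benchmarks: Wikipedia graphs, Actor co-occurrence, WebKB webpages
heterophilic = [WikipediaNetwork(root, name=n) for n in ("chameleon", "squirrel")]
heterophilic += [Actor(root)]
heterophilic += [WebKB(root, name=n) for n in ("Texas", "Cornell")]

for dataset in homophilic + heterophilic:
    data = dataset[0]  # each benchmark is a single graph
    print(dataset, data.num_nodes, data.num_edges, dataset.num_classes)
```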
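
The sparse (2.5%/2.5%/95%) and dense (60%/20%/20%) splits from the Dataset Splits row can be generated as below. A uniform random permutation of the node set is assumed; whether the authors additionally balance the training set across classes is not stated in the quoted text.

```python
import torch

def random_split(num_nodes, train_frac, val_frac, seed=0):
    """Randomly partition node indices into train/validation/test sets."""
    perm = torch.randperm(num_nodes, generator=torch.Generator().manual_seed(seed))
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

# sparse splitting: 2.5% / 2.5% / 95% (num_nodes = 2708 corresponds to Cora)
train_idx, val_idx, test_idx = random_split(2708, train_frac=0.025, val_frac=0.025)
# dense splitting: 60% / 20% / 20%
train_idx, val_idx, test_idx = random_split(2708, train_frac=0.60, val_frac=0.20)
```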
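
The shared tuning grid and the GPR-GNN-specific choices quoted in the Experiment Setup row can be summarized as a small search loop. The stand-in 2-layer MLP (with Cora-sized input and output dimensions) and the elided training loop are illustrative assumptions; only the grid values, the Adam optimizer, the patience, and the epoch budget come from the paper.

```python
import itertools
import torch

# shared grid quoted in the table
learning_rates = [0.002, 0.01, 0.05]
weight_decays = [0.0, 0.0005]
# GPR-GNN-specific choices, swept in addition to the shared grid (not used
# by the stand-in model below)
gamma_inits = ["PPR alpha=0.1", "PPR alpha=0.2", "PPR alpha=0.5", "PPR alpha=0.9",
               "delta_0k", "delta_Kk", "random"]
gpr_dropouts = [0.0, 0.5, 0.7]

for lr, wd in itertools.product(learning_rates, weight_decays):
    # stand-in for the 2-layer MLP with 64 hidden units (1433/7 are Cora's
    # feature and class dimensions, assumed here for concreteness)
    model = torch.nn.Sequential(
        torch.nn.Linear(1433, 64), torch.nn.ReLU(), torch.nn.Linear(64, 7))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)
    # train for at most 1000 epochs with early stopping (patience 200) and
    # select the configuration by validation accuracy
```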