Graph Neural Networks with Heterophily

Authors: Jiong Zhu, Ryan A. Rossi, Anup Rao, Tung Mai, Nedim Lipka, Nesreen K. Ahmed, Danai Koutra

AAAI 2021, pp. 11168-11176

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings.
Researcher Affiliation | Collaboration | (1) University of Michigan, Ann Arbor, USA; (2) Adobe Research, San Jose, USA; (3) Intel Labs, Santa Clara, USA
Pseudocode | No | The paper does not include a dedicated section for pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | We release CPGNN at https://github.com/GemsLab/CPGNN.
Open Datasets | Yes | We assign to the nodes feature vectors from the recently announced Open Graph Benchmark (Hu et al. 2020), which includes only graphs with homophily. ... For real-world graph data, we consider graphs with heterophily and homophily. We use 3 heterophilous graphs, namely Texas, Squirrel and Chameleon (Rozemberczki, Allen, and Sarkar 2019), and 3 widely adopted graphs with strong homophily, which are Cora, Pubmed and Citeseer (Sen et al. 2008; Namata et al. 2012). We use the features and class labels provided by Pei et al. (2020). (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | For synthetic experiments, we generate 3 synthetic graphs for every heterophily level h ∈ {0, 0.1, 0.2, ..., 0.9, 1}. We then randomly select 10% of nodes in each class for training, 10% for validation, and 80% for testing, and report the average classification accuracy as performance of each model on all instances with the same level of heterophily. ... On real-world graphs, we generate 10 random splits for training, validation and test sets; for each split we randomly select 10% of nodes in each class to form the training set, with another 10% for the validation set and the remaining as the test set. (A per-class split sketch follows the table.)
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 GPU used for this research.
Software Dependencies | Yes | All models are trained with Python 3.8.5, PyTorch 1.8.0 with CUDA 11.1 and cuDNN 8005. We use Optuna for hyperparameter tuning. (An environment-check sketch follows the table.)
Experiment Setup | Yes | For all models and datasets, we perform Bayesian Optimization on learning rates in the range [10^−4, 10^−2], weight decay in the range [10^−6, 10^−4], and dropout rates in the range [0.1, 0.5]. (An Optuna search sketch follows the table.)
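
The benchmark graphs listed under Open Datasets are also packaged by PyTorch Geometric, which can be convenient for readers who want quick access to them. The sketch below is an assumption for convenience, not the paper's own data pipeline (that is released at https://github.com/GemsLab/CPGNN); root paths are placeholders.

```python
from torch_geometric.datasets import WebKB, WikipediaNetwork, Planetoid

# Heterophilous graphs used in the paper (Texas, Squirrel, Chameleon)
texas = WebKB(root="data/webkb", name="Texas")
squirrel = WikipediaNetwork(root="data/wikipedia", name="squirrel")
chameleon = WikipediaNetwork(root="data/wikipedia", name="chameleon")

# Homophilous citation graphs (Cora, Pubmed, Citeseer)
cora = Planetoid(root="data/planetoid", name="Cora")
pubmed = Planetoid(root="data/planetoid", name="PubMed")
citeseer = Planetoid(root="data/planetoid", name="CiteSeer")

print(texas[0])  # Data object with node features x, edge_index, and labels y
```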
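
A minimal sketch of the per-class 10%/10%/80% split described under Dataset Splits, assuming node labels are available as a NumPy array; the function name and rounding behaviour are illustrative and not taken from the released CPGNN code.

```python
import numpy as np

def per_class_split(labels, train_frac=0.1, val_frac=0.1, seed=0):
    """Randomly pick 10% of nodes per class for training, 10% for validation, rest for testing."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx, test_idx = [], [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = int(round(train_frac * len(idx)))
        n_val = int(round(val_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        val_idx.extend(idx[n_train:n_train + n_val])
        test_idx.extend(idx[n_train + n_val:])
    return np.array(train_idx), np.array(val_idx), np.array(test_idx)

# Example: 10 random splits, as in the real-world experiments (placeholder labels)
labels = np.random.randint(0, 5, size=1000)
splits = [per_class_split(labels, seed=s) for s in range(10)]
```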
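
A quick check of the local environment against the versions quoted under Software Dependencies (Python 3.8.5, PyTorch 1.8.0, CUDA 11.1, cuDNN 8005, Optuna); it only reports installed versions and does not enforce the exact pins.

```python
import platform
import torch
import optuna

# Report locally installed versions for comparison with the paper's reported setup.
print("Python :", platform.python_version())
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)              # None for CPU-only builds
print("cuDNN  :", torch.backends.cudnn.version())  # integer form, e.g. 8005 for cuDNN 8.0.5
print("Optuna :", optuna.__version__)
```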
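
A hedged sketch of the hyperparameter search described under Experiment Setup, using Optuna's default TPE sampler over the quoted ranges. The `train_and_evaluate` helper is a placeholder for training a CPGNN variant and returning validation accuracy, and the log-scale sampling for learning rate and weight decay is an assumption.

```python
import optuna

def train_and_evaluate(lr, weight_decay, dropout):
    # Placeholder: train a model with these hyperparameters and return validation accuracy.
    return 0.0

def objective(trial):
    # Search ranges quoted from the report; log scale for lr and weight decay is assumed.
    lr = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-4, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    return train_and_evaluate(lr, weight_decay, dropout)

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params)
```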