Graph Neural Networks with Heterophily
Authors: Jiong Zhu, Ryan A. Rossi, Anup Rao, Tung Mai, Nedim Lipka, Nesreen K. Ahmed, Danai Koutra
AAAI 2021, pp. 11168-11176
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings. |
| Researcher Affiliation | Collaboration | 1University of Michigan, Ann Arbor, USA 2Adobe Research, San Jose, USA 3Intel Labs, Santa Clara, USA |
| Pseudocode | No | The paper does not include a dedicated section for pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | We release CPGNN at https://github.com/GemsLab/CPGNN. |
| Open Datasets | Yes | We assign to the nodes feature vectors from the recently announced Open Graph Benchmark (Hu et al. 2020), which includes only graphs with homophily. ... For real-world graph data, we consider graphs with heterophily and homophily. We use 3 heterophilous graphs, namely Texas, Squirrel and Chameleon (Rozemberczki, Allen, and Sarkar 2019), and 3 widely adopted graphs with strong homophily, which are Cora, Pubmed and Citeseer (Sen et al. 2008; Namata et al. 2012). We use the features and class labels provided by Pei et al. (2020). (A hedged loading sketch follows the table.) |
| Dataset Splits | Yes | For synthetic experiments, we generate 3 synthetic graphs for every heterophily level h ∈ {0, 0.1, 0.2, . . . , 0.9, 1}. We then randomly select 10% of nodes in each class for training, 10% for validation, and 80% for testing, and report the average classification accuracy as performance of each model on all instances with the same level of heterophily. ... On real-world graphs, we generate 10 random splits for training, validation and test sets; for each split we randomly select 10% of nodes in each class to form the training set, with another 10% for the validation set and the remaining as the test set. (A split sketch follows the table.) |
| Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 GPU used for this research. |
| Software Dependencies | Yes | All models are trained with Python 3.8.5, PyTorch 1.8.0 with CUDA 11.1 and cuDNN 8005. We use Optuna for hyperparameter tuning. |
| Experiment Setup | Yes | For all models and datasets, we perform Bayesian Optimization on learning rates in the range [10^−4, 10^−2], weight decay in the range [10^−6, 10^−4], and dropout rates in the range [0.1, 0.5]. (A tuning sketch follows the table.) |
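The six real-world graphs quoted in the Open Datasets row are available through common graph libraries. As one illustration, the sketch below loads them via PyTorch Geometric's built-in dataset classes; this is a convenience assumption on our part, not the authors' pipeline, which the paper says uses the features and class labels released by Pei et al. (2020).

```python
from torch_geometric.datasets import WebKB, WikipediaNetwork, Planetoid

root = "data"  # local cache directory (arbitrary choice)

# Heterophilous graphs used in the paper.
texas = WebKB(root, name="Texas")[0]
squirrel = WikipediaNetwork(root, name="squirrel")[0]
chameleon = WikipediaNetwork(root, name="chameleon")[0]

# Strongly homophilous citation graphs.
cora = Planetoid(root, name="Cora")[0]
citeseer = Planetoid(root, name="CiteSeer")[0]
pubmed = Planetoid(root, name="PubMed")[0]

print(texas)  # e.g. Data(x=[183, 1703], edge_index=[2, ...], y=[183])
```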
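The per-class 10%/10%/80% protocol described in the Dataset Splits row can be reproduced with a simple stratified sampler. This is a minimal sketch based only on the quoted protocol, not the released CPGNN code; the function name `stratified_split` and details such as rounding are our assumptions.

```python
import numpy as np

def stratified_split(labels, train_frac=0.1, val_frac=0.1, seed=0):
    """Per-class stratified split: 10% train / 10% val / 80% test,
    following the protocol quoted above (rounding behavior is an assumption)."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx, test_idx = [], [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = int(round(train_frac * len(idx)))
        n_val = int(round(val_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        val_idx.extend(idx[n_train:n_train + n_val])
        test_idx.extend(idx[n_train + n_val:])  # remaining ~80% for testing
    return np.array(train_idx), np.array(val_idx), np.array(test_idx)

# Example: 10 random splits, as reported for the real-world graphs.
labels = np.random.randint(0, 5, size=1000)  # placeholder labels
splits = [stratified_split(labels, seed=s) for s in range(10)]
```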
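The Software Dependencies and Experiment Setup rows together describe Optuna-based Bayesian optimization over learning rate, weight decay, and dropout. Below is a hedged sketch of such a search loop; the `train_and_evaluate` helper, the trial budget, the TPE sampler choice, and log-uniform sampling of the rates are all our assumptions, since the paper specifies only the search ranges.

```python
import optuna

def train_and_evaluate(lr, weight_decay, dropout):
    """Hypothetical stand-in for the real CPGNN training loop: train with
    the given hyperparameters and return validation accuracy."""
    # Placeholder so the sketch runs end to end; replace with real training.
    return 1.0 - abs(lr - 1e-3) - weight_decay - abs(dropout - 0.3)

def objective(trial):
    # Search ranges as quoted in the Experiment Setup row;
    # log-uniform sampling for the two rates is our assumption.
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-4, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    return train_and_evaluate(lr, weight_decay, dropout)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)  # trial budget is an assumption
print(study.best_params)
```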