Graph Neural Networks with Heterophily

Authors: Jiong Zhu, Ryan A. Rossi, Anup Rao, Tung Mai, Nedim Lipka, Nesreen K. Ahmed, Danai Koutra

AAAI 2021, pp. 11168-11176

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings.
Researcher Affiliation | Collaboration | (1) University of Michigan, Ann Arbor, USA; (2) Adobe Research, San Jose, USA; (3) Intel Labs, Santa Clara, USA
Pseudocode | No | The paper does not include a dedicated section for pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | We release CPGNN at https://github.com/GemsLab/CPGNN.
Open Datasets | Yes | We assign to the nodes feature vectors from the recently announced Open Graph Benchmark (Hu et al. 2020), which includes only graphs with homophily. ... For real-world graph data, we consider graphs with heterophily and homophily. We use 3 heterophilous graphs, namely Texas, Squirrel and Chameleon (Rozemberczki, Allen, and Sarkar 2019), and 3 widely adopted graphs with strong homophily, which are Cora, Pubmed and Citeseer (Sen et al. 2008; Namata et al. 2012). We use the features and class labels provided by Pei et al. (2020). (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | For synthetic experiments, we generate 3 synthetic graphs for every heterophily level h ∈ {0, 0.1, 0.2, ..., 0.9, 1}. We then randomly select 10% of nodes in each class for training, 10% for validation, and 80% for testing, and report the average classification accuracy as performance of each model on all instances with the same level of heterophily. ... On real-world graphs, we generate 10 random splits for training, validation and test sets; for each split we randomly select 10% of nodes in each class to form the training set, with another 10% for the validation set and the remaining as the test set. (A per-class split sketch follows the table.)
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 GPU used for this research.
Software Dependencies | Yes | All models are trained with Python 3.8.5, PyTorch 1.8.0 with CUDA 11.1 and cuDNN 8005. We use Optuna for hyperparameter tuning. (An environment-check sketch follows the table.)
Experiment Setup | Yes | For all models and datasets, we perform Bayesian Optimization on learning rates in the range [10^−4, 10^−2], weight decay in the range [10^−6, 10^−4], and dropout rates in the range [0.1, 0.5]. (An Optuna search sketch follows the table.)
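
The benchmark graphs listed under Open Datasets are also packaged by PyTorch Geometric, which can be convenient for readers who want quick access to them. The sketch below is an assumption for convenience, not the paper's own data pipeline (that is released at https://github.com/GemsLab/CPGNN); root paths are placeholders.

```python
from torch_geometric.datasets import WebKB, WikipediaNetwork, Planetoid

# Heterophilous graphs used in the paper (Texas, Squirrel, Chameleon)
texas = WebKB(root="data/webkb", name="Texas")
squirrel = WikipediaNetwork(root="data/wikipedia", name="squirrel")
chameleon = WikipediaNetwork(root="data/wikipedia", name="chameleon")

# Homophilous citation graphs (Cora, Pubmed, Citeseer)
cora = Planetoid(root="data/planetoid", name="Cora")
pubmed = Planetoid(root="data/planetoid", name="PubMed")
citeseer = Planetoid(root="data/planetoid", name="CiteSeer")

print(texas[0])  # Data object with node features x, edge_index, and labels y
```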
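
A minimal sketch of the per-class 10%/10%/80% split described under Dataset Splits, assuming node labels are available as a NumPy array; the function name and rounding behaviour are illustrative and not taken from the released CPGNN code.

```python
import numpy as np

def per_class_split(labels, train_frac=0.1, val_frac=0.1, seed=0):
    """Randomly pick 10% of nodes per class for training, 10% for validation, rest for testing."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx, test_idx = [], [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = int(round(train_frac * len(idx)))
        n_val = int(round(val_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        val_idx.extend(idx[n_train:n_train + n_val])
        test_idx.extend(idx[n_train + n_val:])
    return np.array(train_idx), np.array(val_idx), np.array(test_idx)

# Example: 10 random splits, as in the real-world experiments (placeholder labels)
labels = np.random.randint(0, 5, size=1000)
splits = [per_class_split(labels, seed=s) for s in range(10)]
```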
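
A quick check of the local environment against the versions quoted under Software Dependencies (Python 3.8.5, PyTorch 1.8.0, CUDA 11.1, cuDNN 8005, Optuna); it only reports installed versions and does not enforce the exact pins.

```python
import platform
import torch
import optuna

# Report locally installed versions for comparison with the paper's reported setup.
print("Python :", platform.python_version())
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)              # None for CPU-only builds
print("cuDNN  :", torch.backends.cudnn.version())  # integer form, e.g. 8005 for cuDNN 8.0.5
print("Optuna :", optuna.__version__)
```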
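
A hedged sketch of the hyperparameter search described under Experiment Setup, using Optuna's default TPE sampler over the quoted ranges. The `train_and_evaluate` helper is a placeholder for training a CPGNN variant and returning validation accuracy, and the log-scale sampling for learning rate and weight decay is an assumption.

```python
import optuna

def train_and_evaluate(lr, weight_decay, dropout):
    # Placeholder: train a model with these hyperparameters and return validation accuracy.
    return 0.0

def objective(trial):
    # Search ranges quoted from the report; log scale for lr and weight decay is assumed.
    lr = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-4, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    return train_and_evaluate(lr, weight_decay, dropout)

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params)
```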