Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Is Homophily a Necessity for Graph Neural Networks?

Authors: Yao Ma, Xiaorui Liu, Neil Shah, Jiliang Tang

ICLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "However, we empirically find that standard graph convolutional networks (GCNs) can actually achieve strong performance on some commonly used heterophilous graphs. This motivates us to reconsider whether homophily is truly necessary for good GNN performance. We find that this claim is not quite accurate, and certain types of good heterophily exist, under which GCNs can achieve strong performance. Our work carefully characterizes the implications of different heterophily conditions, and provides supporting theoretical understanding and empirical observations."
Researcher Affiliation | Collaboration | Yao Ma (New Jersey Institute of Technology, EMAIL); Xiaorui Liu (Michigan State University, EMAIL); Neil Shah (Snap Inc., EMAIL); Jiliang Tang (Michigan State University, EMAIL)
Pseudocode | Yes | Alg. 1: Heterophilous Edge Addition; Alg. 2: Heterophilous Edge Addition with Noise
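The two algorithms are only named above, not reproduced. As an illustration of the general idea, here is a minimal sketch of adding synthetic cross-class ("heterophilous") edges with an optional noise level; every name (`add_heterophilous_edges`, `noise`, etc.) is hypothetical and may differ from the paper's Algorithms 1-2:

```python
import numpy as np

def add_heterophilous_edges(labels, n_edges, noise=0.0, rng=None):
    """Sample synthetic edges between nodes of different classes.

    With probability `noise`, an edge is instead drawn between two nodes
    of the same class, mimicking a noisy variant. This is a sketch of the
    general technique, not the paper's exact Algorithms 1-2.
    """
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    n = len(labels)
    edges = []
    while len(edges) < n_edges:
        u, v = rng.integers(0, n, size=2)
        if u == v:
            continue  # skip self-loops
        same_class = labels[u] == labels[v]
        want_same = rng.random() < noise  # noisy edges stay within a class
        if same_class == want_same:
            edges.append((int(u), int(v)))
    return edges
```

With `noise=0.0` every sampled edge connects two differently labeled nodes; raising `noise` mixes in same-class edges.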
Open Source Code | No | The paper references code for other models (H2GCN, GPR-GNN, CPGNN) that the authors adopted, but does not state that they provide their own source code for the methodology or experiments described in the paper.
Open Datasets | Yes | "We include the citation networks Cora, Citeseer and Pubmed (Kipf and Welling, 2016), which are highly homophilous. We also adopt several heterophilous benchmark datasets including Chameleon, Squirrel, Actor, Cornell, Wisconsin and Texas (Rozemberczki et al., 2021; Pei et al., 2020)."
Dataset Splits | Yes | "For all datasets, we follow the experimental setting provided in (Pei et al., 2020), which consists of 10 random splits with proportions 48/32/20% corresponding to training/validation/test for each graph."
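A minimal sketch of how such 48/32/20% random node splits could be generated; the function and key names are illustrative, not taken from the Pei et al. (2020) codebase:

```python
import numpy as np

def random_splits(n_nodes, n_splits=10, fractions=(0.48, 0.32, 0.20), seed=0):
    """Generate `n_splits` random train/val/test node index splits.

    Follows the 48/32/20% proportions quoted above; names are illustrative.
    """
    rng = np.random.default_rng(seed)
    n_train = int(fractions[0] * n_nodes)
    n_val = int(fractions[1] * n_nodes)
    splits = []
    for _ in range(n_splits):
        perm = rng.permutation(n_nodes)  # fresh shuffle per split
        splits.append({
            "train": perm[:n_train],
            "val": perm[n_train:n_train + n_val],
            "test": perm[n_train + n_val:],  # remainder (~20%)
        })
    return splits
```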
Hardware Specification | Yes | "All experiments are run on a cluster equipped with Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz CPUs and NVIDIA Tesla K80 GPUs."
Software Dependencies | No | The paper mentions adapting codebases for other models (H2GCN, GPR-GNN, CPGNN) and implicitly uses deep learning frameworks, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We tune parameters for GCN, GPR-GNN, CPGNN, and MLP+GCN from the following options: learning rate: {0.002, 0.005, 0.01, 0.05}; weight decay: {5e-04, 5e-05, 5e-06, 5e-07, 5e-08, 1e-05, 0}; dropout rate: {0, 0.2, 0.5, 0.8}. For GPR-GNN, we use the PPR as the initialization for the coefficients. For MLP+GCN, we tune α from {0.2, 0.4, 0.6, 0.8, 1}."
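The quoted setup describes an exhaustive sweep over small hyperparameter grids. A generic grid-search sketch over those exact value lists might look as follows; `train_and_eval` is a hypothetical stand-in for actual model training, not the authors' code:

```python
from itertools import product

# Hyperparameter grid quoted in the experiment setup above.
GRID = {
    "lr": [0.002, 0.005, 0.01, 0.05],
    "weight_decay": [5e-4, 5e-5, 5e-6, 5e-7, 5e-8, 1e-5, 0],
    "dropout": [0.0, 0.2, 0.5, 0.8],
}

def grid_search(train_and_eval, grid=GRID):
    """Exhaustively evaluate every grid combination; return the config
    with the best validation score. `train_and_eval` is a hypothetical
    callable mapping a config dict to a validation score."""
    best_cfg, best_score = None, float("-inf")
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = train_and_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

The full grid here has 4 × 7 × 4 = 112 combinations per model, which is small enough that exhaustive search is practical.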