Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Is Homophily a Necessity for Graph Neural Networks?
Authors: Yao Ma, Xiaorui Liu, Neil Shah, Jiliang Tang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | However, we empirically find that standard graph convolutional networks (GCNs) can actually achieve strong performance on some commonly used heterophilous graphs. This motivates us to reconsider whether homophily is truly necessary for good GNN performance. We find that this claim is not quite accurate, and certain types of good heterophily exist, under which GCNs can achieve strong performance. Our work carefully characterizes the implications of different heterophily conditions, and provides supporting theoretical understanding and empirical observations. |
| Researcher Affiliation | Collaboration | Yao Ma (New Jersey Institute of Technology), Xiaorui Liu (Michigan State University), Neil Shah (Snap Inc.), Jiliang Tang (Michigan State University) |
| Pseudocode | Yes | Algorithm 1: Heterophilous Edge Addition; Algorithm 2: Heterophilous Edge Addition with Noise |
| Open Source Code | No | The paper references code for other models (H2GCN, GPR-GNN, CPGNN) that they adopted, but does not state that they provide their own source code for the methodology or experiments described in the paper. |
| Open Datasets | Yes | We include the citation networks Cora, Citeseer and Pubmed (Kipf and Welling, 2016), which are highly homophilous. We also adopt several heterophilous benchmark datasets including Chameleon, Squirrel, Actor, Cornell, Wisconsin and Texas (Rozemberczki et al., 2021; Pei et al., 2020). |
| Dataset Splits | Yes | For all datasets, we follow the experimental setting provided in (Pei et al., 2020), which consists of 10 random splits with proportions 48/32/20% corresponding to training/validation/test for each graph. |
| Hardware Specification | Yes | All experiments are run on a cluster equipped with Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz CPUs and NVIDIA Tesla K80 GPUs. |
| Software Dependencies | No | The paper mentions adapting codebases for other models (H2GCN, GPR-GNN, CPGNN) and implicitly uses deep learning frameworks, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We tune parameters for GCN, GPR-GNN, CPGNN, and MLP+GCN from the following options: learning rate: {0.002, 0.005, 0.01, 0.05}; weight decay: {5e-04, 5e-05, 5e-06, 5e-07, 5e-08, 1e-05, 0}; dropout rate: {0, 0.2, 0.5, 0.8}. For GPR-GNN, we use the PPR as the initialization for the coefficients. For MLP+GCN, we tune α from {0.2, 0.4, 0.6, 0.8, 1}. |
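The 48/32/20% train/validation/test protocol with 10 random splits (Pei et al., 2020) cited in the Dataset Splits row can be sketched as follows. This is a hypothetical illustration, not the authors' code; the node count `2708` (Cora) and the seed-per-split convention are assumptions.

```python
# Hedged sketch of the 10 random 48/32/20% node splits described in the
# Dataset Splits row. Not the authors' implementation; NumPy only.
import numpy as np

def random_split(num_nodes, train_frac=0.48, val_frac=0.32, seed=0):
    """Return boolean train/val/test masks for one random split."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    train_mask = np.zeros(num_nodes, dtype=bool)
    val_mask = np.zeros(num_nodes, dtype=bool)
    test_mask = np.zeros(num_nodes, dtype=bool)
    train_mask[perm[:n_train]] = True
    val_mask[perm[n_train:n_train + n_val]] = True
    test_mask[perm[n_train + n_val:]] = True  # remaining ~20% of nodes
    return train_mask, val_mask, test_mask

# Ten splits, one per seed, following the Pei et al. (2020) protocol.
splits = [random_split(2708, seed=s) for s in range(10)]  # 2708 = Cora nodes
```

Masks rather than index lists make it easy to select rows from a node-feature matrix or a per-node loss without further bookkeeping.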
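The hyperparameter grid in the Experiment Setup row can be enumerated mechanically. The sketch below is illustrative only: `itertools.product` yields every combination of the listed learning rates, weight decays, and dropout rates, and the resulting config dicts would feed whatever training routine is used (not shown, since the paper does not release code).

```python
# Hedged sketch: enumerating the tuning grid quoted in the Experiment Setup
# row. The dict keys ("lr", "weight_decay", "dropout") are assumed names.
from itertools import product

learning_rates = [0.002, 0.005, 0.01, 0.05]
weight_decays = [5e-04, 5e-05, 5e-06, 5e-07, 5e-08, 1e-05, 0]
dropout_rates = [0, 0.2, 0.5, 0.8]

def grid():
    """Yield one config dict per point in the Cartesian product."""
    for lr, wd, p in product(learning_rates, weight_decays, dropout_rates):
        yield {"lr": lr, "weight_decay": wd, "dropout": p}

configs = list(grid())  # 4 * 7 * 4 = 112 configurations per model
```

At 112 configurations per model, evaluated over 10 random splits each, the search remains tractable on the K80 cluster described in the Hardware Specification row.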