Simplified Graph Convolution with Heterophily
Authors: Sudhanshu Chanpuriya, Cameron Musco
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we confirm that SGC is indeed ineffective for heterophilous (i.e., non-homophilous) graphs via experiments on synthetic and real-world datasets. |
| Researcher Affiliation | Academia | Sudhanshu Chanpuriya, University of Massachusetts Amherst, schanpuriya@cs.umass.edu; Cameron Musco, University of Massachusetts Amherst, cmusco@cs.umass.edu |
| Pseudocode | Yes | Algorithm 1 Adaptive Simple Graph Convolution (ASGC) Filter |
| Open Source Code | Yes | We release code in the form of a Jupyter notebook (Pérez & Granger, 2007) demo which is available at github.com/schariya/adaptive-simple-convolution. |
| Open Datasets | Yes | We experiment on 10 commonly-used datasets, the same collection of datasets as Chien et al. (2021). CORA, CITESEER, and PUBMED are citation networks which are common benchmarks for node classification (Sen et al., 2008; Namata et al., 2012)... Like Chien et al. (2021), we use random 60%/20%/20% splits as training/validation/test data for the 5 heterophilous datasets, as in Pei et al. (2020), and use random 2.5%/2.5%/95% splits for the homophilous datasets, which is closer to the original setting from Kipf & Welling (2017) and Shchur et al. (2018). |
| Dataset Splits | Yes | Like Chien et al. (2021), we use random 60%/20%/20% splits as training/validation/test data for the 5 heterophilous datasets, as in Pei et al. (2020), and use random 2.5%/2.5%/95% splits for the homophilous datasets, which is closer to the original setting from Kipf & Welling (2017) and Shchur et al. (2018). |
| Hardware Specification | No | The paper states: 'As should be clear from the time complexity discussion in Section 3 and the dataset statistics in Table 1, our method is lightweight enough to run the benchmarks within a few hours on a laptop without a GPU, so compute is not a significant concern.' However, it does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions: 'The SGC and ASGC algorithms are implemented in Python using NumPy (Harris et al., 2020) for least squares regression and other linear algebraic computations. We use scikit-learn (Pedregosa et al., 2011) for logistic regression with 1,000 maximum iterations and otherwise default settings.' However, it does not provide specific version numbers for Python, NumPy, or scikit-learn. |
| Experiment Setup | Yes | We tune the number of hops over K ∈ {1, 2, 4, 8}, roughly covering the range analyzed in Wu et al. (2019), and the regularization strength λ_R = n·λ'_R over log10(λ'_R) ∈ {−4, −3, −2, −1, 0}. This dependency on the number of nodes n allows the regularization loss to scale with the least squares loss, which generally grows linearly with n. In Appendix 9.2, we report results for some additional experiments investigating the effect of fixing these hyperparameters. ... For our implementations of SGC and ASGC, we treat each network as undirected, in that if edge (i, j) appears, we also include edge (j, i). Like Chien et al. (2021), we use random 60%/20%/20% splits as training/validation/test data for the 5 heterophilous datasets, as in Pei et al. (2020), and use random 2.5%/2.5%/95% splits for the homophilous datasets, which is closer to the original setting from Kipf & Welling (2017) and Shchur et al. (2018). |
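
The Pseudocode row points to Algorithm 1, the ASGC filter, but the report does not quote the algorithm body. The sketch below is a minimal Python/NumPy reading of such a per-feature filter, assuming it approximates each feature by a ridge-regularized least-squares combination of its 1- to K-hop propagations; the function name `asgc_filter`, the choice of basis, and the solver are illustrative assumptions, not the paper's exact Algorithm 1.

```python
import numpy as np
import scipy.sparse as sp

def asgc_filter(adj, X, K=8, lam=1e-2):
    """Hypothetical sketch of an ASGC-style per-feature filter.

    adj : scipy.sparse adjacency matrix (assumed already symmetrized)
    X   : (n, d) dense feature matrix
    K   : number of hops, tuned over {1, 2, 4, 8} in the paper
    lam : regularization strength (the paper scales it with n)
    """
    n = adj.shape[0]
    # Self-looped, symmetrically normalized adjacency, as in SGC.
    A_tilde = adj + sp.eye(n)
    deg = np.asarray(A_tilde.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    S = d_inv_sqrt @ A_tilde @ d_inv_sqrt

    X_hat = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        x = X[:, j]
        # Basis of the feature's own 1..K-hop propagations.
        Z = np.empty((n, K))
        z = x
        for k in range(K):
            z = S @ z
            Z[:, k] = z
        # Ridge-regularized least squares: approximate the feature by a
        # smooth (filtered) combination of its propagated versions.
        theta = np.linalg.solve(Z.T @ Z + lam * np.eye(K), Z.T @ x)
        X_hat[:, j] = Z @ theta
    return X_hat
```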
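
The Software Dependencies row confirms NumPy for the linear algebra and scikit-learn logistic regression with 1,000 maximum iterations and otherwise default settings. Below is a minimal sketch of the SGC-style baseline pipeline the paper evaluates against: fixed K-hop propagation with the self-looped, symmetrically normalized adjacency (Wu et al., 2019), followed by logistic regression; the data-loading step and variable names are assumed.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import LogisticRegression

def sgc_features(adj, X, K=2):
    """Fixed SGC propagation: X' = S^K X, with S the self-looped,
    symmetrically normalized adjacency (Wu et al., 2019)."""
    n = adj.shape[0]
    A_tilde = adj + sp.eye(n)
    deg = np.asarray(A_tilde.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    S = d_inv_sqrt @ A_tilde @ d_inv_sqrt
    X_prop = X
    for _ in range(K):
        X_prop = S @ X_prop
    return X_prop

# Illustrative usage (adj, X, y, and the split indices are assumed loaded):
# adj = adj.maximum(adj.T)                  # treat the graph as undirected
# X_prop = sgc_features(adj, X, K=2)
# clf = LogisticRegression(max_iter=1000)   # 1,000 iterations, defaults otherwise
# clf.fit(X_prop[train_idx], y[train_idx])
# test_acc = clf.score(X_prop[test_idx], y[test_idx])
```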
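
The Dataset Splits and Experiment Setup rows give the split proportions and the tuning grid but not the code that generates them. The following is a sketch of one way to produce the random splits and the (K, λ_R) grid, assuming uniform random permutations of the node indices; the helper `random_split`, the seed, and the example node count are illustrative.

```python
import numpy as np
from itertools import product

def random_split(n, train_frac, val_frac, seed=0):
    """Randomly split n node indices into train/validation/test arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

n_nodes = 2708  # e.g., the number of nodes in CORA

# 60%/20%/20% for the 5 heterophilous datasets,
# 2.5%/2.5%/95% for the homophilous ones.
hetero_split = random_split(n_nodes, 0.60, 0.20)
homo_split = random_split(n_nodes, 0.025, 0.025)

# Tuning grid: hops K in {1, 2, 4, 8}; log10(lambda') in {-4, ..., 0},
# with the effective regularization strength scaled by n.
grid = [(K, n_nodes * 10.0 ** log_lam)
        for K, log_lam in product([1, 2, 4, 8], [-4, -3, -2, -1, 0])]
```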