Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Authors: Zehao Niu, Mihai Anitescu, Jie Chen
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we introduce this inductive bias into GPs to improve their predictive performance for graph-structured data. We show that a prominent example of GNNs, the graph convolutional network, is equivalent to some GP when its layers are infinitely wide; and we analyze the kernel universality and the limiting behavior in depth. We further present a programmable procedure to compose covariance kernels inspired by this equivalence and derive example kernels corresponding to several interesting members of the GNN family. We also propose a computationally efficient approximation of the covariance matrix for scalable posterior inference with large-scale data. We demonstrate that these graph-based kernels lead to competitive classification and regression performance, as well as advantages in computation time, compared with the respective GNNs. (Section 6, Experiments) In this section, we conduct a comprehensive set of experiments to evaluate the performance of the GP kernels derived by taking limits on the layer width of GCN and other GNNs. |
| Researcher Affiliation | Collaboration | Zehao Niu¹, Mihai Anitescu¹˒² (¹University of Chicago, ²Argonne National Laboratory); Jie Chen (MIT-IBM Watson AI Lab, IBM Research) |
| Pseudocode | Yes | Algorithm 1 (Computing K^(L)): K̂^(L) = Q^(L) Q^(L)ᵀ |
| Open Source Code | Yes | Code is available at https://github.com/niuzehao/gnn-gp. |
| Open Datasets | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The dataset ArXiv comes from the Open Graph Benchmark (Hu et al., 2020b). The datasets Chameleon/Squirrel/Crocodile come from MUSAE (Rozemberczki et al., 2021). |
| Dataset Splits | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The training/validation/test splits of the former two sets of datasets come from Geom-GCN (Pei et al., 2020), in accordance with the PyTorch Geometric library. The split for Crocodile is not available, so we conduct a random split with the same 0.48/0.32/0.20 proportion as that used for Chameleon and Squirrel (Rozemberczki et al., 2021). |
| Hardware Specification | Yes | All experiments are conducted on an NVIDIA Quadro GV100 GPU with 32GB of HBM2 memory. |
| Software Dependencies | Yes | The code is written in Python 3.10.4 as distributed with Ubuntu 22.04 LTS. We use PyTorch 1.11.0 and PyTorch Geometric 2.1.0 with CUDA 11.3. |
| Experiment Setup | Yes | For classification tasks in Table 3, the hyperparameters are set to σb = 0.0, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01. For regression tasks, they are set to σb = 0.1, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01. |
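The Pseudocode row quotes Algorithm 1, which factorizes the (approximate) covariance matrix as K̂^(L) = Q^(L) Q^(L)ᵀ. This is the standard reason a low-rank factor enables scalable posterior inference: with K ≈ QQᵀ for an n×r factor Q, the Woodbury identity reduces the GP solve from O(n³) to O(nr²). The sketch below is not the authors' implementation (their code is at the linked repository); the function name and setup are illustrative, showing only the generic low-rank GP regression trick that such a factorization permits.

```python
import numpy as np

def lowrank_gp_posterior_mean(Q_train, Q_test, y, noise=1e-2):
    """Posterior mean of GP regression with low-rank kernel K = Q Q^T.

    Woodbury identity: (Q Q^T + s I_n)^{-1} y
        = (y - Q (s I_r + Q^T Q)^{-1} Q^T y) / s,
    so only an r x r system is solved instead of an n x n one.
    """
    n, r = Q_train.shape
    s = noise
    A = s * np.eye(r) + Q_train.T @ Q_train          # r x r system
    alpha = (y - Q_train @ np.linalg.solve(A, Q_train.T @ y)) / s
    # Cross-covariance K_{test,train} = Q_test Q_train^T applied to alpha
    return Q_test @ (Q_train.T @ alpha)

rng = np.random.default_rng(0)
Q = rng.normal(size=(100, 8)) / np.sqrt(8)   # hypothetical rank-8 factor
w = rng.normal(size=8)
y = Q @ w                                    # target lying in the kernel's range
mean = lowrank_gp_posterior_mean(Q[:80], Q[80:], y[:80], noise=1e-6)
```

With a tiny noise term and a target in the span of the factor, the posterior mean recovers the held-out values almost exactly, while the cost is dominated by the 8×8 solve rather than an 80×80 one.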