Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning

Authors: Zehao Niu, Mihai Anitescu, Jie Chen

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we introduce this inductive bias into GPs to improve their predictive performance for graph-structured data. We show that a prominent example of GNNs, the graph convolutional network, is equivalent to some GP when its layers are infinitely wide; and we analyze the kernel universality and the limiting behavior in depth. We further present a programmable procedure to compose covariance kernels inspired by this equivalence and derive example kernels corresponding to several interesting members of the GNN family. We also propose a computationally efficient approximation of the covariance matrix for scalable posterior inference with large-scale data. We demonstrate that these graph-based kernels lead to competitive classification and regression performance, as well as advantages in computation time, compared with the respective GNNs. 6 EXPERIMENTS. In this section, we conduct a comprehensive set of experiments to evaluate the performance of the GP kernels derived by taking limits on the layer width of GCN and other GNNs.
Researcher Affiliation | Collaboration | Zehao Niu (1), Mihai Anitescu (1, 2); (1) University of Chicago, (2) Argonne National Laboratory; niuzehao@uchicago.edu, anitescu@mcs.anl.gov. Jie Chen, MIT-IBM Watson AI Lab, IBM Research; chenjie@us.ibm.com
Pseudocode | Yes | Algorithm 1: Computing $K^{(L)} \approx \widehat{K}^{(L)} = Q^{(L)} {Q^{(L)}}^{T}$
Open Source Code | Yes | Code is available at https://github.com/niuzehao/gnn-gp.
Open Datasets | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The dataset ArXiv comes from the Open Graph Benchmark (Hu et al., 2020b). The datasets Chameleon/Squirrel/Crocodile come from MUSAE (Rozemberczki et al., 2021).
Dataset Splits | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The training/validation/test splits of the former two sets of datasets come from Geom-GCN (Pei et al., 2020), in accordance with the PyTorch Geometric library. The split for Crocodile is not available, so we conduct a random split with the same 0.48/0.32/0.20 proportion as that used for Chameleon and Squirrel (Rozemberczki et al., 2021).
Hardware Specification | Yes | All experiments are conducted on an Nvidia Quadro GV100 GPU with 32GB of HBM2 memory.
Software Dependencies | Yes | The code is written in Python 3.10.4 as distributed with Ubuntu 22.04 LTS. We use PyTorch 1.11.0 and PyTorch Geometric 2.1.0 with CUDA 11.3.
Experiment Setup | Yes | For classification tasks in Table 3, the hyperparameters are set to σb = 0.0, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01. For regression tasks, they are set to σb = 0.1, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01.
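
The Dataset Splits entry mentions a random 0.48/0.32/0.20 training/validation/test split for Crocodile. The following is a minimal sketch of how such a split could be drawn. It is not taken from the authors' repository: the function name `random_split`, the fixed seed, and the rounding rule are all assumptions made for illustration.

```python
import random


def random_split(num_nodes, fractions=(0.48, 0.32, 0.20), seed=0):
    """Randomly partition node indices into train/validation/test lists.

    Hypothetical helper: the name, seed, and rounding rule are
    illustrative assumptions, not the paper's implementation.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    # Shuffle all node indices with a seeded generator for repeatability.
    indices = list(range(num_nodes))
    random.Random(seed).shuffle(indices)

    # Round the first two sizes; the test set takes the remainder so that
    # every node lands in exactly one subset.
    n_train = round(fractions[0] * num_nodes)
    n_val = round(fractions[1] * num_nodes)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = random_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Giving the remainder to the test set keeps the three subsets disjoint and exhaustive even when the fractions do not divide the node count evenly. In a PyTorch Geometric workflow the index lists would typically be converted to boolean node masks, which is left out here to keep the sketch dependency-free.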