Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Authors: Zehao Niu, Mihai Anitescu, Jie Chen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we introduce this inductive bias into GPs to improve their predictive performance for graph-structured data. We show that a prominent example of GNNs, the graph convolutional network, is equivalent to some GP when its layers are infinitely wide; and we analyze the kernel universality and the limiting behavior in depth. We further present a programmable procedure to compose covariance kernels inspired by this equivalence and derive example kernels corresponding to several interesting members of the GNN family. We also propose a computationally efficient approximation of the covariance matrix for scalable posterior inference with large-scale data. We demonstrate that these graph-based kernels lead to competitive classification and regression performance, as well as advantages in computation time, compared with the respective GNNs. [Section 6, Experiments:] In this section, we conduct a comprehensive set of experiments to evaluate the performance of the GP kernels derived by taking limits on the layer width of GCN and other GNNs. |
| Researcher Affiliation | Collaboration | Zehao Niu1, Mihai Anitescu1,2 1University of Chicago, 2Argonne National Laboratory niuzehao@uchicago.edu anitescu@mcs.anl.gov Jie Chen MIT-IBM Watson AI Lab IBM Research chenjie@us.ibm.com |
| Pseudocode | Yes | Algorithm 1 Computing K(L): K̂(L) = Q(L)Q(L)ᵀ |
| Open Source Code | Yes | Code is available at https://github.com/niuzehao/gnn-gp. |
| Open Datasets | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The dataset ArXiv comes from the Open Graph Benchmark (Hu et al., 2020b). The datasets Chameleon/Squirrel/Crocodile come from MUSAE (Rozemberczki et al., 2021). |
| Dataset Splits | Yes | The datasets Cora/Citeseer/PubMed/Reddit, with predefined training/validation/test splits, are downloaded from the PyTorch Geometric library (Fey & Lenssen, 2019) and used as is. The training/validation/test splits of the former two sets of datasets come from GeomGCN (Pei et al., 2020), in accordance with the PyTorch Geometric library. The split for Crocodile is not available, so we conduct a random split with the same 0.48/0.32/0.20 proportion as that used for Chameleon and Squirrel (Rozemberczki et al., 2021). |
| Hardware Specification | Yes | All experiments are conducted on an Nvidia Quadro GV100 GPU with 32GB of HBM2 memory. |
| Software Dependencies | Yes | The code is written in Python 3.10.4 as distributed with Ubuntu 22.04 LTS. We use PyTorch 1.11.0 and PyTorch Geometric 2.1.0 with CUDA 11.3. |
| Experiment Setup | Yes | For classification tasks in Table 3, the hyperparameters are set to σb = 0.0, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01. For regression tasks, they are set to σb = 0.1, σw = 1.0, L = 2, hidden = 256, and dropout = 0.5. GCN is trained with learning rate 0.01. |
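The quoted evidence describes a GP kernel obtained by taking the infinite-width limit of a GCN, parameterized by σb, σw, and depth L. The following is a minimal NumPy sketch of that style of construction, not the authors' released code: it alternates graph propagation with the closed-form ReLU expectation (the degree-1 arc-cosine kernel). The function names `gcn_gp_kernel` and `relu_expectation`, and the exact layer bookkeeping, are illustrative assumptions; the paper's Algorithm 1 additionally uses a low-rank factorization K̂(L) = Q(L)Q(L)ᵀ for scalability, which this dense sketch omits.

```python
import numpy as np

def relu_expectation(K):
    """Degree-1 arc-cosine kernel: E[relu(u) relu(v)] for (u, v) ~ N(0, K)."""
    d = np.sqrt(np.diag(K))
    outer = np.outer(d, d)
    # Correlation, clipped for numerical safety before arccos.
    c = np.clip(K / np.maximum(outer, 1e-12), -1.0, 1.0)
    theta = np.arccos(c)
    return outer * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def gcn_gp_kernel(A_hat, X, L=2, sigma_b=0.0, sigma_w=1.0):
    """Dense covariance of an infinite-width L-layer GCN (illustrative sketch).

    A_hat: normalized adjacency (with self-loops), shape (n, n).
    X:     node features, shape (n, d).
    """
    # Input-layer covariance after one propagation step.
    K = A_hat @ (X @ X.T / X.shape[1]) @ A_hat.T
    for _ in range(L - 1):
        # Bias/weight variances, then the ReLU nonlinearity in expectation,
        # then another round of graph propagation.
        K = sigma_b**2 + sigma_w**2 * relu_expectation(K)
        K = A_hat @ K @ A_hat.T
    return K
```

With the hyperparameters quoted above (σb = 0.0, σw = 1.0, L = 2), this reduces to two propagation steps wrapped around a single arc-cosine kernel evaluation; the resulting matrix is symmetric positive semidefinite and can be used directly for GP posterior inference on the labeled nodes.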