OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs
Authors: Yangze Zhou, Gitta Kutyniok, Bruno Ribeiro
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs) such as Graph Neural Networks (GNNs) to perform inductive out-of-distribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove non-asymptotic bounds showing that link predictors based on permutation-equivariant (structural) node embeddings obtained by gMPNNs can converge to a random guess as test graphs get larger. We then propose a theoretically-sound gMPNN that outputs structural pairwise (2-node) embeddings and prove non-asymptotic bounds showing that, as test graphs grow, these embeddings converge to embeddings of a continuous function that retains its ability to predict links OOD. Empirical results on random graphs show agreement with our theoretical results. |
| Researcher Affiliation | Academia | Yangze Zhou, Department of Statistics, Purdue University, West Lafayette, IN 47903, zhou950@purdue.edu; Gitta Kutyniok, Department of Mathematics, Ludwig-Maximilians-Universität München, Munich, Germany, kutyniok@math.lmu.de; Bruno Ribeiro, Department of Computer Science, Purdue University, West Lafayette, IN 47903, ribeiro@cs.purdue.edu |
| Pseudocode | No | The paper provides mathematical definitions and descriptions of algorithms but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks formatted as code. |
| Open Source Code | Yes | We implement all our models in Pytorch Geometric [19] and make it available1. (Footnote 1: https://github.com/yangzez/OOD-Link-Prediction-Generalization-MPNN) |
| Open Datasets | Yes | We start by sampling the training graph (G^tr, F^tr) with N^tr = 10^3 nodes. ...Link prediction performance evaluation with ogbl-ddi (in-distribution and OOD). |
| Dataset Splits | Yes | Then we split Ehid-tr into positive train (80%) and validation (10%) edges (we reserve 10% of Ehid-tr for the transductive test scenario), and uniformly sample the same number of across-block non-edges as negative train and validation edges. |
| Hardware Specification | Yes | All experiments are done on a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'Pytorch Geometric [19]' but does not specify its version number, nor does it provide version numbers for any other software dependencies. |
| Experiment Setup | Yes | Our GraphSAGE, GCN, GAT and GIN models (GNN) all have 2 layers with hidden dimension 128. For the link prediction, we use a 3-layer neural network (MLP) with 10 hidden units per layer. The learning rate is 0.001 for all tasks and models with Adam optimizer, with early stopping if validation loss does not drop for 50 epochs, with a maximum of 500 epochs. Batch size is 128. |
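The dataset-split procedure quoted in the table (80% train / 10% validation positives, 10% reserved for the transductive test, with an equal number of uniformly sampled across-block non-edges as negatives) can be sketched in plain Python. This is a minimal sketch of the described protocol, not the authors' code; the function name and the `edges`/`non_edges` inputs are illustrative.

```python
import random

def split_hidden_edges(edges, non_edges, seed=0):
    """Split hidden across-block edges into 80% train / 10% val / 10% test
    positives, and sample an equal number of non-edges as negatives
    for the train and val splits (as described in the quoted setup)."""
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)
    n = len(edges)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    pos = {
        "train": edges[:n_train],
        "val": edges[n_train:n_train + n_val],
        # Remaining 10% is reserved for the transductive test scenario.
        "test": edges[n_train + n_val:],
    }
    # Uniformly sample the same number of across-block non-edges
    # as negative train and validation edges.
    neg_pool = list(non_edges)
    neg = {s: rng.sample(neg_pool, len(pos[s])) for s in ("train", "val")}
    return pos, neg
```

A usage note: with 100 hidden edges this yields 80/10/10 positives and matching negative counts for train and val.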
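The early-stopping rule in the experiment setup (stop when validation loss has not dropped for 50 epochs, with a 500-epoch cap) can be expressed as a small stdlib helper. The class name and API below are illustrative assumptions, not taken from the paper's repository.

```python
class EarlyStopper:
    """Stop training when validation loss fails to improve for `patience`
    consecutive epochs, or when `max_epochs` epochs have elapsed."""

    def __init__(self, patience=50, max_epochs=500):
        self.patience = patience
        self.max_epochs = max_epochs
        self.best = float("inf")
        self.bad_epochs = 0
        self.epoch = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        self.epoch += 1
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0  # loss dropped: reset the patience counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience or self.epoch >= self.max_epochs
```

In a training loop one would call `stopper.step(val_loss)` once per epoch and break when it returns `True`; the paper's values correspond to the defaults `patience=50, max_epochs=500`.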