Generalization Error of Graph Neural Networks in the Mean-field Regime

Authors: Gholamali Aminian, Yixuan He, Gesine Reinert, Lukasz Szpruch, Samuel N. Cohen

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We carry out an empirical analysis on both synthetic and real-world data sets.
Researcher Affiliation | Academia | (1) The Alan Turing Institute, London, United Kingdom; (2) Department of Statistics, University of Oxford, Oxford, United Kingdom; (3) School of Mathematics, University of Edinburgh; (4) Mathematical Institute, University of Oxford, Oxford, United Kingdom.
Pseudocode | No | The paper does not contain pseudocode or an algorithm block.
Open Source Code | Yes | Code. Implementation code is provided at https://github.com/SherylHYX/GNN_MF_GE.
Open Datasets | Yes | For synthetic data sets, we generate three types of Stochastic Block Models (SBMs) and two types of Erdős–Rényi (ER) models, with 200 graphs for each type. We also conduct experiments on a bioinformatics data set called PROTEINS (Borgwardt et al., 2005).
Dataset Splits | Yes | We use a supervised ratio of βsup ∈ {0.7, 0.9} for h ∈ {4, 8, 16, 32, 64, 126, 256} for our experiments detailed in App. F. For each random graph of an individual synthetic data set, we generate a 16-dimensional random Gaussian node feature (normalized to have unit ℓ2 norm) and a binary class label following a uniform distribution. The random train-test split ratio is βsup : (1 − βsup), where in our experiments we vary βsup in {0.7, 0.9}.
Hardware Specification | Yes | Hardware and setup. Experiments were conducted on two compute nodes, each with 8 Nvidia Tesla T4 GPUs, 96 Intel Xeon Platinum 8259CL CPUs @ 2.50GHz, and 378GB RAM.
Software Dependencies | No | The paper mentions using PyTorch but does not specify a version number for it or any other key software dependencies.
Experiment Setup | Yes | Training. We train for 200 epochs for synthetic data sets and 50 epochs for PROTEINS. The batch size is 128 for all data sets. Optimizer. Taking the regularization term into account, we use Stochastic Gradient Descent (SGD) from PyTorch as the optimizer and ℓ2 regularization with weight decay 1/(hα) to avoid overfitting, where h is the width of the hidden layer and α is a tuning parameter which we set to be 100. We use a learning rate of 0.005 and a momentum of 0.9 throughout.
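
To make the data-generation recipe quoted under Open Datasets and Dataset Splits concrete, the following is a minimal sketch using NumPy and NetworkX. It is not the authors' released code: graph sizes, block sizes, and edge probabilities are illustrative assumptions, while the 200-graphs-per-type count, the 16-dimensional unit-norm Gaussian node features, the uniform binary labels, and the βsup : (1 − βsup) split follow the quoted text.

```python
# Minimal sketch (not the authors' released code) of the synthetic data described
# above. Only the 200-graphs-per-type count, the 16-dimensional unit-norm Gaussian
# node features, the uniform binary labels, and the beta_sup : (1 - beta_sup) split
# come from the quoted text; graph sizes, block sizes, and edge probabilities below
# are illustrative assumptions.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

def make_graph(kind: str, n_nodes: int = 50) -> nx.Graph:
    """Draw one random graph of the given kind ("sbm" or "er")."""
    if kind == "sbm":
        sizes = [n_nodes // 2, n_nodes - n_nodes // 2]       # assumed two equal blocks
        probs = [[0.5, 0.1], [0.1, 0.5]]                     # assumed edge probabilities
        return nx.stochastic_block_model(sizes, probs, seed=int(rng.integers(1 << 31)))
    return nx.erdos_renyi_graph(n_nodes, p=0.3, seed=int(rng.integers(1 << 31)))

def make_dataset(kind: str, n_graphs: int = 200, feat_dim: int = 16):
    """Generate graphs, unit-norm Gaussian node features, and uniform binary labels."""
    graphs, features, labels = [], [], []
    for _ in range(n_graphs):
        g = make_graph(kind)
        x = rng.standard_normal((g.number_of_nodes(), feat_dim))
        x /= np.linalg.norm(x, axis=1, keepdims=True)        # normalise to unit l2 norm
        graphs.append(g)
        features.append(x)
        labels.append(int(rng.integers(2)))                  # label ~ Uniform{0, 1}
    return graphs, features, labels

def train_test_split(n_graphs: int, beta_sup: float = 0.7):
    """Random split with ratio beta_sup : (1 - beta_sup), beta_sup in {0.7, 0.9}."""
    perm = rng.permutation(n_graphs)
    cut = int(beta_sup * n_graphs)
    return perm[:cut], perm[cut:]
```

For instance, calling make_dataset("sbm") followed by train_test_split(200, 0.7) gives a 0.7 : 0.3 split over 200 SBM graphs, matching one of the βsup settings reported above.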
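Likewise, the training configuration quoted under Experiment Setup could be wired up in PyTorch roughly as follows. The small sequential network is only a placeholder for the paper's one-hidden-layer GNN of width h, and the per-graph data loading and training loop are omitted; the hyperparameters mirror the quoted values.

```python
# Minimal sketch, assuming PyTorch. The two-layer MLP is only a stand-in for the
# paper's one-hidden-layer GNN of width h; the hyperparameters mirror the quoted
# setup (SGD, learning rate 0.005, momentum 0.9, l2 weight decay 1/(h*alpha) with
# alpha = 100, batch size 128, 200 epochs for synthetic data / 50 for PROTEINS).
import torch

h = 64             # hidden-layer width, one of the swept values
alpha = 100        # tuning parameter from the quoted setup

model = torch.nn.Sequential(           # placeholder for the width-h GNN
    torch.nn.Linear(16, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, 2),
)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.005,                          # learning rate 0.005
    momentum=0.9,                      # momentum 0.9
    weight_decay=1.0 / (h * alpha),    # l2 regularisation with weight decay 1/(h*alpha)
)

num_epochs = 200                       # 200 for synthetic data sets, 50 for PROTEINS
batch_size = 128                       # same batch size for all data sets
```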