Graph Neural Networks Do Not Always Oversmooth

Authors: Bastian Epping, Alexandre René, Moritz Helias, Michael T Schaub

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the validity of this prediction in finite-size GCNs by training a linear classifier on their output. Moreover, using the linearization of the GCN GP, we generalize the concept of propagation depth of information from DNNs to GCNs. This propagation depth diverges at the transition between the oversmoothing and non-oversmoothing phase. We test the predictions of our approach and find good agreement with finite-size GCNs. [see the classifier sketch below the table]
Researcher Affiliation | Academia | (1) RWTH Aachen University, Aachen, Germany; (2) Department of Physics, RWTH Aachen University, Aachen, Germany; (3) Institute for Advanced Simulation (IAS-6), Computational and Systems Neuroscience, Jülich Research Centre, Jülich, Germany
Pseudocode | No | The paper contains mathematical equations and derivations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is publicly available under https://github.com/bepping/non-oversmoothing-gcns.
Open Datasets | Yes | As a proof of concept, we demonstrate this approach for the Contextual Stochastic Block Model (CSBM) [8], a common synthetic model which allows generating a graph with two communities and community-wise correlated features on the nodes. Additionally, we test the performance of the GCN GP on the real world citation network Cora [38]. [see the CSBM sketch below the table]
Dataset Splits | No | In a semi-supervised node classification setting, we split the underlying graph into N_test unlabeled test nodes and N_train labeled training nodes (N_test + N_train = N); we correspondingly split the output random variable Y_i ∈ R^N for output dimension i into Y_i^* ∈ R^{N_test} and Y_i^D ∈ R^{N_train}. In all panels we use N_train = 10 training nodes and N_test = 10 test nodes, five training nodes from each of the two communities. [see the split sketch below the table]
Hardware Specification | No | Computations were performed on CPUs. Requirements for the experiments with synthetic data are (approximately): Figure 1: 10 min on a single-core laptop; Figure 2: 10 h on a single node of an internal CPU cluster.
Software Dependencies | No | To conduct our experiments we use NumPy [13], SciPy [42] (both available under a BSD-3-Clause License) and scikit-learn (sklearn) [31] (available under a New BSD License). [see the environment sketch below the table]
Experiment Setup | Yes | Generalization error (mean squared error) of the Gaussian process for a CSBM with parameters N = 20, d = 5, λ = 1, γ = 1 and µ = 4. The shift operator is defined in (6) with g = 0.1; other parameters are σ_b² = 0, ϕ(x) = erf(√π x / 2) and σ_ro = 0.01. [see the configuration sketch below the table]
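
To make the "training a linear classifier on their output" step from the Research Type row concrete, here is a minimal sketch. It is not the authors' code: `gcn_output` is a placeholder for the last-layer features of a finite-size GCN, and the logistic-regression readout is one reasonable choice of linear classifier.

```python
# Minimal sketch (not the authors' code): fit a linear classifier on the
# output features of a finite-size GCN. `gcn_output` and `labels` are
# placeholder data standing in for the real GCN readout and communities.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N, out_dim = 20, 2                              # node count / output width (assumed)
gcn_output = rng.normal(size=(N, out_dim))      # stand-in for last-layer GCN features
labels = np.repeat([0, 1], N // 2)              # two communities, as in the CSBM

clf = LogisticRegression().fit(gcn_output, labels)
print("training accuracy:", clf.score(gcn_output, labels))
```

The Open Datasets row refers to the Contextual Stochastic Block Model. The sketch below generates a small CSBM instance under the standard parameterization of Deshpande et al. [8], using the parameter values quoted in the Experiment Setup row; the paper's exact conventions (in particular the meaning of γ and the feature scaling) may differ, so treat this as illustrative only.

```python
# Minimal CSBM sketch (assumption: the Deshpande et al. parameterization;
# the paper's exact conventions may differ). Two communities, community-wise
# correlated node features, parameters as quoted in the Experiment Setup row.
import numpy as np

rng = np.random.default_rng(0)
N, d, lam, gamma, mu = 20, 5, 1.0, 1.0, 4.0
F = int(N / gamma)                       # feature dimension (assuming gamma = N/F)

v = np.repeat([1.0, -1.0], N // 2)       # community labels +-1
# Edge probability d/N, shifted up/down by lambda*sqrt(d)/N within/between communities
p = d / N + lam * np.sqrt(d) / N * np.outer(v, v)
A = (rng.random((N, N)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                              # symmetric adjacency, no self-loops

# Community-wise correlated features: shared direction u plus i.i.d. noise
u = rng.normal(size=F) / np.sqrt(F)
X = np.sqrt(mu / N) * np.outer(v, u) + rng.normal(size=(N, F)) / np.sqrt(F)
```

The Dataset Splits row describes 10 labeled training nodes (five per community) and 10 unlabeled test nodes on an N = 20 node graph. A minimal sketch of such a split, assuming the community labels of the synthetic data are known:

```python
# Minimal split sketch: 5 training nodes per community, remaining nodes as test set.
import numpy as np

N = 20
labels = np.repeat([0, 1], N // 2)

rng = np.random.default_rng(0)
train_idx = np.concatenate([
    rng.choice(np.where(labels == c)[0], size=5, replace=False) for c in (0, 1)
])
test_idx = np.setdiff1d(np.arange(N), train_idx)
assert len(train_idx) == 10 and len(test_idx) == 10
```

The Software Dependencies row names NumPy, SciPy and scikit-learn but no versions; a minimal, unpinned environment sketch:

```python
# Unpinned environment sketch; the paper does not state version numbers.
# pip install numpy scipy scikit-learn
import numpy, scipy, sklearn
print(numpy.__version__, scipy.__version__, sklearn.__version__)
```

Finally, the hyper-parameters quoted in the Experiment Setup row can be collected as below. The activation is reconstructed as ϕ(x) = erf(√π x / 2), which has unit slope at the origin; the shift operator of Eq. (6) is only represented by its parameter g here, since its exact form is defined in the paper.

```python
# Minimal configuration sketch (not the authors' code) of the quoted setup.
import numpy as np
from scipy.special import erf

params = dict(N=20, d=5, lam=1.0, gamma=1.0, mu=4.0,
              g=0.1, sigma_b_sq=0.0, sigma_ro=0.01)

def phi(x):
    # Reconstructed activation: erf(sqrt(pi)/2 * x)
    return erf(np.sqrt(np.pi) / 2.0 * x)

# Numerical check of the unit-slope property: phi'(0) = (sqrt(pi)/2) * 2/sqrt(pi) = 1
eps = 1e-6
print(params)
print("phi'(0) ~", (phi(eps) - phi(-eps)) / (2 * eps))
```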
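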
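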
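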
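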