Graph Neural Networks Do Not Always Oversmooth
Authors: Bastian Epping, Alexandre René, Moritz Helias, Michael T Schaub
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the validity of this prediction in finite-size GCNs by training a linear classifier on their output. Moreover, using the linearization of the GCN GP, we generalize the concept of propagation depth of information from DNNs to GCNs. This propagation depth diverges at the transition between the oversmoothing and non-oversmoothing phase. We test the predictions of our approach and find good agreement with finite-size GCNs. |
| Researcher Affiliation | Academia | RWTH Aachen University, Aachen, Germany; Department of Physics, RWTH Aachen University, Aachen, Germany; Institute for Advanced Simulation (IAS-6), Computational and Systems Neuroscience, Jülich Research Centre, Jülich, Germany |
| Pseudocode | No | The paper contains mathematical equations and derivations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly available under https://github.com/bepping/non-oversmoothing-gcns. |
| Open Datasets | Yes | As a proof of concept, we demonstrate this approach for the Contextual Stochastic Block Model (CSBM) [8], a common synthetic model which allows generating a graph with two communities and community-wise correlated features on the nodes. Additionally, we test the performance of the GCN GP on the real world citation network Cora [38]. |
| Dataset Splits | No | In a semi-supervised node classification setting, we split the underlying graph into N_test unlabeled test nodes and N_train labeled training nodes (N_test + N_train = N); we correspondingly split the output random variable Y_i ∈ ℝ^N for output dimension i into Y_i^∗ ∈ ℝ^(N_test) and Y_i^D ∈ ℝ^(N_train). In all panels we use N_train = 10 training nodes and N_test = 10 test nodes, with five training nodes from each of the two communities. |
| Hardware Specification | No | Computations were performed on CPUs. Approximate requirements for the experiments with synthetic data: Figure 1: 10 minutes on a single-core laptop. Figure 2: 10 hours on a single node of an internal CPU cluster. |
| Software Dependencies | No | To conduct our experiments we use NumPy [13], SciPy [42] (both available under a BSD-3-Clause License) and Scikit-learn (sklearn) [31] (available under a New BSD License). |
| Experiment Setup | Yes | Generalization error (mean squared error) of the Gaussian process for a CSBM with parameters N = 20, d = 5, λ = 1, γ = 1 and µ = 4. The shift operator is defined in (6) with g = 0.1; other parameters are σ_b² = 0, ϕ(x) = erf((√π/2)·x) and σ_ro = 0.01. |
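
For context, the sketch below illustrates the kind of setup quoted in the Open Datasets, Dataset Splits, and Experiment Setup rows: generating a two-community Contextual Stochastic Block Model graph with community-wise correlated node features, and forming the balanced train/test split of 10 training and 10 test nodes. It is a minimal sketch, assuming the standard CSBM parameterization of Deshpande et al. [8] and interpreting γ as the node-to-feature-dimension ratio N/F; the variable names, feature scaling, and random seed are illustrative and not taken from the paper's released code.

```python
# Hedged sketch of a CSBM graph, features, and a balanced train/test split,
# using only NumPy (one of the dependencies listed above).
import numpy as np

rng = np.random.default_rng(0)

# Parameters quoted in the Experiment Setup row.
N, d, lam, gamma, mu = 20, 5.0, 1.0, 1.0, 4.0
F = int(N / gamma)  # feature dimension, assuming gamma = N / F

# Community labels: first half +1, second half -1.
v = np.ones(N)
v[N // 2:] = -1.0

# CSBM edge probabilities: c_in/N within a community, c_out/N across.
c_in = d + np.sqrt(d) * lam
c_out = d - np.sqrt(d) * lam
p_edge = np.where(np.outer(v, v) > 0, c_in / N, c_out / N)

# Symmetric adjacency matrix without self-loops.
upper = np.triu(rng.random((N, N)) < p_edge, k=1)
A = (upper | upper.T).astype(float)

# Community-wise correlated features: a shared direction u plus i.i.d. noise.
u = rng.normal(0.0, 1.0 / np.sqrt(F), size=F)
X = np.sqrt(mu / N) * np.outer(v, u) + rng.normal(size=(N, F)) / np.sqrt(F)

# Balanced semi-supervised split: five training nodes from each community,
# the remaining nodes are test nodes (here N_train = N_test = 10).
train_idx = np.concatenate([np.arange(5), N // 2 + np.arange(5)])
test_idx = np.setdiff1d(np.arange(N), train_idx)

print("adjacency:", A.shape, "features:", X.shape,
      "train nodes:", train_idx.tolist())
```

The linear classifier mentioned in the Research Type row is trained on the GCN output rather than on the raw features; with the split above, such a readout could be fit on the training nodes using scikit-learn's linear models, consistent with the dependencies listed under Software Dependencies.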