Graph Neural Networks Use Graphs When They Shouldn’t
Authors: Maya Bechler-Speicher, Ido Amos, Ran Gilad-Bachrach, Amir Globerson
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyze the implicit bias of gradient-descent learning of GNNs and prove that when the ground truth function does not use the graphs, GNNs are not guaranteed to learn a solution that ignores the graph, even with infinite data. We examine this phenomenon with respect to different graph distributions and find that regular graphs are more robust to this overfitting. We also prove that within the family of regular graphs, GNNs are guaranteed to extrapolate when learning with gradient descent. Finally, based on our empirical and theoretical findings, we demonstrate on real data how regular graphs can be leveraged to reduce graph overfitting and enhance performance. ... In this section, we present an empirical evaluation showing that GNNs tend to overfit the graph-structure, thus hurting their generalization accuracy. |
| Researcher Affiliation | Collaboration | 1Blavatnik School of Computer Science, Tel-Aviv University 2School of Electrical Engineering, Tel-Aviv University 3Department of Bio-Medical Engineering and Edmond J. Safra Center for Bioinformatics, Tel-Aviv University 4Now also at Google Research. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available on https://github.com/mayabechlerspeicher/Graph_Neural_Networks_Overfit_Graphs |
| Open Datasets | Yes | Sum: This is a binary classification synthetic task with a graph-less ground truth function. Proteins and Enzymes: These are two classification tasks on real-world molecular data (Morris et al., 2020). ... We used 11 graph datasets, including two large-scale datasets... Enzymes, D&D, Proteins, NCI1 (Shervashidze et al., 2011) are datasets of chemical compounds... IMDB-B, IMDB-M, Collab, Reddit-B, Reddit-5k (Yanardag & Vishwanathan, 2015) are social network datasets. mol-hiv, mol-pcba (Hu et al., 2020) are large-scale datasets of molecular property prediction. |
| Dataset Splits | Yes | The GNNs architecture is fixed, and the learning hyperparameters are tuned on a validation set for the Sum task and with 10-fold cross-validation for Proteins and Enzymes. For all the datasets except mol-hiv and mol-pcba we used 10-fold nested cross-validation with the splits and protocol of Errica et al. (2022). |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' but does not provide version numbers for any specific software components or libraries. |
| Experiment Setup | Yes | The GNNs architecture is fixed, and the learning hyperparameters are tuned on a validation set for the Sum task and with 10-fold cross-validation for Proteins and Enzymes. ... All GNNs use ReLU activations with {3, 5} layers and 64 hidden channels. They were trained with the Adam optimizer over 1000 epochs with early stopping on the validation loss with a patience of 100 steps, weight decay of 1e-4, learning rate in {1e-3, 1e-4}, dropout rate in {0, 0.5}, and a train batch size of 32. |
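The reported setup amounts to a small hyperparameter grid searched under (nested) 10-fold cross-validation. The stdlib-only sketch below is a hypothetical illustration of that protocol, not the authors' code: the grid values (layers, learning rate, dropout, weight decay, batch size) come from the Experiment Setup row above, while the fold-splitting helper is a simplified stand-in for the splits of Errica et al. (2022).

```python
# Minimal sketch of the tuning grid and 10-fold split protocol described
# in the table above. GRID values are taken from the paper's reported
# setup; k_fold_indices is a hypothetical, simplified splitter (the
# actual protocol follows Errica et al., 2022).
from itertools import product

GRID = {
    "num_layers": [3, 5],
    "hidden_channels": [64],       # fixed
    "learning_rate": [1e-3, 1e-4],
    "dropout": [0.0, 0.5],
    "weight_decay": [1e-4],        # fixed
    "batch_size": [32],            # fixed
}

def grid_configs(grid):
    """Yield every hyperparameter combination in the grid as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k near-equal contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

configs = list(grid_configs(GRID))
print(len(configs))  # 2 layer choices x 2 learning rates x 2 dropouts = 8

folds = k_fold_indices(100, k=10)
print(len(folds), len(folds[0]))  # 10 folds of 10 samples each
```

With only 8 candidate configurations per fold, the search is cheap relative to training; the fixed architecture (64 hidden channels, batch size 32, weight decay 1e-4) keeps the comparison between GNN variants controlled.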