A Fair Comparison of Graph Neural Networks for Graph Classification

Authors: Federico Errica, Marco Podda, Davide Bacciu, Alessio Micheli

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "we ran more than 47000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks."
Researcher Affiliation | Academia | Federico Errica, Department of Computer Science, University of Pisa (federico.errica@phd.unipi.it); Marco Podda, Department of Computer Science, University of Pisa (marco.podda@di.unipi.it); Davide Bacciu, Department of Computer Science, University of Pisa (bacciu@di.unipi.it); Alessio Micheli, Department of Computer Science, University of Pisa (micheli@di.unipi.it)
Pseudocode | Yes | "Table 2: Pseudo-code for model assessment (left) and model selection (right). In Algorithm 1, Select refers to Algorithm 2, whereas Train and Eval represent training and inference phases, respectively." (A sketch of this protocol follows the table.)
Open Source Code | Yes | "We publicly release code and dataset splits to reproduce our results, in order to allow other researchers to carry out rigorous evaluations with minimum additional effort." Code available at: https://github.com/diningphil/gnn-comparison
Open Datasets | Yes | "All graph datasets are publicly available (Kersting et al., 2016) and represent a relevant subset of those most frequently used in literature to compare GNNs."
Dataset Splits | Yes | "Our experimental approach is to use a 10-fold CV for model assessment and an inner holdout technique with a 90%/10% training/validation split for model selection. Moreover, all data splits are stratified, i.e., class proportions are preserved inside each k-fold split as well as in the holdout splits used for model selection." (Split construction is sketched after the table.)
Hardware Specification | No | The paper mentions "extensive use of parallelism, both in CPU and GPU" but does not specify the hardware models or configurations used for the experiments.
Software Dependencies | No | The paper states "All models have been implemented by means of the Pytorch Geometrics library (Fey & Lenssen, 2019)", but it does not specify version numbers for PyTorch Geometric or other critical software dependencies.
Experiment Setup | Yes | "Hyper-parameter tuning is performed via grid search. For the sake of conciseness, we list all hyper-parameters in Section A.4. We select the number of convolutional layers, the embedding space dimension, the learning rate, and the criterion for early stopping (either based on the validation accuracy or validation loss) for all models." (Grid enumeration is sketched after the table.)
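
To make the evaluation protocol in the Pseudocode and Dataset Splits rows concrete, here is a minimal Python sketch of the nested scheme the paper describes: an outer k-fold cross-validation for model assessment with an inner holdout for model selection. The helpers train_model, evaluate, and split_inner are hypothetical placeholders rather than functions from the released repository, and the number of final training runs per fold is shown only as a configurable argument.

```python
import numpy as np

def select(train_idx, configs, data, split_inner, train_model, evaluate):
    # Algorithm 2 (model selection): train each candidate configuration on an
    # inner training split and keep the one with the best validation score.
    inner_train, inner_val = split_inner(train_idx)
    val_scores = [evaluate(train_model(data, inner_train, cfg), data, inner_val)
                  for cfg in configs]
    return configs[int(np.argmax(val_scores))]

def assess(outer_folds, configs, data, split_inner, train_model, evaluate, final_runs=3):
    # Algorithm 1 (model assessment): for every outer fold, select a
    # configuration, retrain it final_runs times, and average the resulting
    # test scores; the mean and std across folds estimate generalization.
    per_fold = []
    for train_idx, test_idx in outer_folds:
        best_cfg = select(train_idx, configs, data, split_inner, train_model, evaluate)
        runs = [evaluate(train_model(data, train_idx, best_cfg), data, test_idx)
                for _ in range(final_runs)]
        per_fold.append(float(np.mean(runs)))
    return float(np.mean(per_fold)), float(np.std(per_fold))
```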
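
The stratified splits described in the Dataset Splits row can be illustrated with standard scikit-learn utilities. This is only a sketch of the stratification logic, assuming labels holds one class label per graph; the repository linked above ships the exact precomputed splits used in the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def make_stratified_splits(labels, n_folds=10, val_fraction=0.1, seed=0):
    # Outer 10-fold stratified CV for model assessment; each outer training
    # fold gets a stratified 90%/10% train/validation holdout for selection.
    y = np.asarray(labels)
    outer = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    splits = []
    for train_idx, test_idx in outer.split(np.zeros(len(y)), y):
        tr_idx, val_idx = train_test_split(
            train_idx, test_size=val_fraction,
            stratify=y[train_idx], random_state=seed)
        splits.append({"train": tr_idx, "val": val_idx, "test": test_idx})
    return splits
```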
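
Finally, the grid search in the Experiment Setup row amounts to enumerating every combination of the tuned hyper-parameters. The value ranges below are hypothetical stand-ins (the actual per-model grids are listed in Section A.4 of the paper); only the enumeration itself is being illustrated.

```python
from itertools import product

# Hypothetical grid; the real per-model ranges are given in Section A.4.
grid = {
    "num_conv_layers": [2, 3, 4],
    "embedding_dim": [32, 64],
    "learning_rate": [1e-2, 1e-3],
    "early_stopping_metric": ["val_accuracy", "val_loss"],
}

def iter_configs(grid):
    # Yield one dict per combination of hyper-parameter values.
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))

# Each configuration would be passed to the model-selection routine sketched above.
configs = list(iter_configs(grid))
```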