A Fair Comparison of Graph Neural Networks for Graph Classification
Authors: Federico Errica, Marco Podda, Davide Bacciu, Alessio Micheli
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran more than 47000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks. |
| Researcher Affiliation | Academia | Federico Errica, Department of Computer Science, University of Pisa, federico.errica@phd.unipi.it; Marco Podda, Department of Computer Science, University of Pisa, marco.podda@di.unipi.it; Davide Bacciu, Department of Computer Science, University of Pisa, bacciu@di.unipi.it; Alessio Micheli, Department of Computer Science, University of Pisa, micheli@di.unipi.it |
| Pseudocode | Yes | Table 2: Pseudo-code for model assessment (left) and model selection (right). In Algorithm 1, Select refers to Algorithm 2, whereas Train and Eval represent the training and inference phases, respectively. (A Python sketch of this nested protocol appears below the table.) |
| Open Source Code | Yes | We publicly release code and dataset splits to reproduce our results, in order to allow other researchers to carry out rigorous evaluations with minimum additional effort1. 1Code available at: https://github.com/diningphil/gnn-comparison |
| Open Datasets | Yes | All graph datasets are publicly available (Kersting et al., 2016) and represent a relevant subset of those most frequently used in the literature to compare GNNs. (A loading sketch appears below the table.) |
| Dataset Splits | Yes | Our experimental approach is to use a 10-fold CV for model assessment and an inner holdout technique with a 90%/10% training/validation split for model selection. Moreover, all data splits are stratified, i.e., class proportions are preserved inside each k-fold split as well as in the holdout splits used for model selection. (A split-generation sketch appears below the table.) |
| Hardware Specification | No | The paper mentions "extensive use of parallelism, both in CPU and GPU" but does not specify the CPU/GPU models or hardware configuration used for the experiments. |
| Software Dependencies | No | The paper states "All models have been implemented by means of the Pytorch Geometrics library (Fey & Lenssen, 2019)", but it does not specify version numbers for PyTorch Geometric or other critical software dependencies. |
| Experiment Setup | Yes | Hyper-parameter tuning is performed via grid search. For the sake of conciseness, we list all hyper-parameters in Section A.4. We select the number of convolutional layers, the embedding space dimension, the learning rate, and the criterion for early stopping (based on either the validation accuracy or the validation loss) for all models. (A grid-enumeration sketch appears below the table.) |
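The nested evaluation protocol summarized in Table 2 can be sketched compactly in Python. This is a paraphrase under stated assumptions, not the authors' implementation: `train_model`, `evaluate`, and `split_holdout` are hypothetical callables standing in for the paper's Train, Eval, and holdout steps, and `n_final_runs` is illustrative (the paper averages several final training runs per outer fold to smooth out unlucky initializations).

```python
import numpy as np

def select(train_data, val_data, configs, train_model, evaluate):
    """Algorithm 2 (model selection): return the configuration with the
    best validation score on the inner holdout split."""
    best_config, best_score = None, -np.inf
    for config in configs:
        model = train_model(train_data, config)   # Train
        score = evaluate(model, val_data)         # Eval on the validation set
        if score > best_score:
            best_config, best_score = config, score
    return best_config

def assess(folds, configs, split_holdout, train_model, evaluate, n_final_runs=3):
    """Algorithm 1 (model assessment): for each outer CV fold, run model
    selection on the training portion, retrain the winning configuration,
    and average its test performance across folds."""
    fold_scores = []
    for train_data, test_data in folds:
        inner_train, inner_val = split_holdout(train_data)   # 90%/10% holdout
        best_config = select(inner_train, inner_val, configs,
                             train_model, evaluate)
        runs = [evaluate(train_model(train_data, best_config), test_data)
                for _ in range(n_final_runs)]
        fold_scores.append(np.mean(runs))
    return np.mean(fold_scores), np.std(fold_scores)
```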
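Since the benchmarks come from the public TU collection (Kersting et al., 2016), loading one with PyTorch Geometric, the library the paper builds on, takes a few lines; the dataset name and cache directory below are illustrative.

```python
from torch_geometric.datasets import TUDataset

# Download (on first use) and load one of the TU benchmarks used in the paper.
# 'data/TUDataset' is an arbitrary local cache directory.
dataset = TUDataset(root='data/TUDataset', name='PROTEINS')
print(len(dataset), dataset.num_classes, dataset.num_features)
```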
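The stratified 10-fold assessment splits with an inner 90%/10% holdout could be generated as below with scikit-learn. This is a sketch for illustration only, since the authors ship precomputed split files in their repository; `make_splits` and its seed are hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def make_splits(labels, n_folds=10, seed=42):
    """Stratified 10-fold CV for model assessment, with a stratified 90%/10%
    train/validation holdout inside each training fold for model selection."""
    labels = np.asarray(labels)  # one class label per graph
    outer = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    splits = []
    for train_idx, test_idx in outer.split(labels, labels):
        train_idx, val_idx = train_test_split(
            train_idx,
            test_size=0.1,                # 90%/10% inner holdout
            stratify=labels[train_idx],   # preserve class proportions
            random_state=seed,
        )
        splits.append({'train': train_idx, 'val': val_idx, 'test': test_idx})
    return splits
```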
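Finally, enumerating the hyper-parameter grid is mechanical. The names and values below are illustrative (the actual per-model grids are listed in Section A.4 of the paper); the resulting `configs` list is what a selection routine like the `select` sketch above would iterate over.

```python
from itertools import product

# Illustrative grid; the real per-model grids appear in Section A.4.
grid = {
    'num_layers': [2, 3, 4],
    'embedding_dim': [32, 64],
    'learning_rate': [1e-2, 1e-3],
    'early_stopping_on': ['val_accuracy', 'val_loss'],
}

# Enumerate every combination for the grid search used during model selection.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(f'{len(configs)} configurations per model')  # 3 * 2 * 2 * 2 = 24
```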