A Fair Comparison of Graph Neural Networks for Graph Classification

Authors: Federico Errica, Marco Podda, Davide Bacciu, Alessio Micheli

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "we ran more than 47000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks."
Researcher Affiliation | Academia | Federico Errica, Department of Computer Science, University of Pisa (federico.errica@phd.unipi.it); Marco Podda, Department of Computer Science, University of Pisa (marco.podda@di.unipi.it); Davide Bacciu, Department of Computer Science, University of Pisa (bacciu@di.unipi.it); Alessio Micheli, Department of Computer Science, University of Pisa (micheli@di.unipi.it)
Pseudocode | Yes | "Table 2: Pseudo-code for model assessment (left) and model selection (right). In Algorithm 1, Select refers to Algorithm 2, whereas Train and Eval represent training and inference phases, respectively." (A sketch of this protocol follows the table.)
Open Source Code | Yes | "We publicly release code and dataset splits to reproduce our results, in order to allow other researchers to carry out rigorous evaluations with minimum additional effort." Code available at: https://github.com/diningphil/gnn-comparison
Open Datasets | Yes | "All graph datasets are publicly available (Kersting et al., 2016) and represent a relevant subset of those most frequently used in literature to compare GNNs."
Dataset Splits | Yes | "Our experimental approach is to use a 10-fold CV for model assessment and an inner holdout technique with a 90%/10% training/validation split for model selection. Moreover, all data splits are stratified, i.e., class proportions are preserved inside each k-fold split as well as in the holdout splits used for model selection." (Split construction is sketched after the table.)
Hardware Specification | No | The paper mentions "extensive use of parallelism, both in CPU and GPU" but does not specify the hardware models or configurations used for the experiments.
Software Dependencies | No | The paper states "All models have been implemented by means of the Pytorch Geometrics library (Fey & Lenssen, 2019)", but it does not specify version numbers for PyTorch Geometric or other critical software dependencies.
Experiment Setup | Yes | "Hyper-parameter tuning is performed via grid search. For the sake of conciseness, we list all hyper-parameters in Section A.4. We select the number of convolutional layers, the embedding space dimension, the learning rate, and the criterion for early stopping (either based on the validation accuracy or validation loss) for all models." (Grid enumeration is sketched after the table.)
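
To make the evaluation protocol in the Pseudocode and Dataset Splits rows concrete, here is a minimal Python sketch of the nested scheme the paper describes: an outer k-fold cross-validation for model assessment with an inner holdout for model selection. The helpers train_model, evaluate, and split_inner are hypothetical placeholders rather than functions from the released repository, and the number of final training runs per fold is shown only as a configurable argument.

```python
import numpy as np

def select(train_idx, configs, data, split_inner, train_model, evaluate):
    # Algorithm 2 (model selection): train each candidate configuration on an
    # inner training split and keep the one with the best validation score.
    inner_train, inner_val = split_inner(train_idx)
    val_scores = [evaluate(train_model(data, inner_train, cfg), data, inner_val)
                  for cfg in configs]
    return configs[int(np.argmax(val_scores))]

def assess(outer_folds, configs, data, split_inner, train_model, evaluate, final_runs=3):
    # Algorithm 1 (model assessment): for every outer fold, select a
    # configuration, retrain it final_runs times, and average the resulting
    # test scores; the mean and std across folds estimate generalization.
    per_fold = []
    for train_idx, test_idx in outer_folds:
        best_cfg = select(train_idx, configs, data, split_inner, train_model, evaluate)
        runs = [evaluate(train_model(data, train_idx, best_cfg), data, test_idx)
                for _ in range(final_runs)]
        per_fold.append(float(np.mean(runs)))
    return float(np.mean(per_fold)), float(np.std(per_fold))
```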
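
The stratified splits described in the Dataset Splits row can be illustrated with standard scikit-learn utilities. This is only a sketch of the stratification logic, assuming labels holds one class label per graph; the repository linked above ships the exact precomputed splits used in the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def make_stratified_splits(labels, n_folds=10, val_fraction=0.1, seed=0):
    # Outer 10-fold stratified CV for model assessment; each outer training
    # fold gets a stratified 90%/10% train/validation holdout for selection.
    y = np.asarray(labels)
    outer = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    splits = []
    for train_idx, test_idx in outer.split(np.zeros(len(y)), y):
        tr_idx, val_idx = train_test_split(
            train_idx, test_size=val_fraction,
            stratify=y[train_idx], random_state=seed)
        splits.append({"train": tr_idx, "val": val_idx, "test": test_idx})
    return splits
```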
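
Finally, the grid search in the Experiment Setup row amounts to enumerating every combination of the tuned hyper-parameters. The value ranges below are hypothetical stand-ins (the actual per-model grids are listed in Section A.4 of the paper); only the enumeration itself is being illustrated.

```python
from itertools import product

# Hypothetical grid; the real per-model ranges are given in Section A.4.
grid = {
    "num_conv_layers": [2, 3, 4],
    "embedding_dim": [32, 64],
    "learning_rate": [1e-2, 1e-3],
    "early_stopping_metric": ["val_accuracy", "val_loss"],
}

def iter_configs(grid):
    # Yield one dict per combination of hyper-parameter values.
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))

# Each configuration would be passed to the model-selection routine sketched above.
configs = list(iter_configs(grid))
```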