GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings

Authors: Matthias Fey, Jan E. Lenssen, Frank Weichert, Jure Leskovec

ICML 2021

Reproducibility assessment (each entry lists the variable, the assessed result, and the LLM response):
Research Type: Experimental. "Empirically, we show that the practical realization of our framework, PyGAS, an easy-to-use extension for PyTorch Geometric, is both fast and memory-efficient, learns expressive node representations, closely resembles the performance of their non-scaling counterparts, and reaches state-of-the-art performance on large-scale graphs."
Researcher Affiliation: Academia. Matthias Fey (1), Jan Eric Lenssen (1), Frank Weichert (1), Jure Leskovec (2); (1) Department of Computer Science, TU Dortmund University; (2) Department of Computer Science, Stanford University.
Pseudocode: Yes. "cf. our training algorithm in the appendix."
Open Source Code: Yes. "We implement our framework practically as PyGAS, an extension for the PyTorch Geometric library (Fey & Lenssen, 2019), which makes it easy to convert common and custom GNN models into their scalable variants and to apply them to large-scale graphs." Code: https://github.com/rusty1s/pyg_autoscale
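The core idea the released code implements is historical embeddings: a buffer holds the most recently computed embedding of every node, so a mini-batch can read (possibly stale) states of out-of-batch neighbors instead of recomputing them. The sketch below is a hedged, framework-agnostic NumPy illustration of that mechanism; it is not the PyGAS API, and the class and method names (`History`, `push`, `pull`) are assumptions chosen for clarity.

```python
import numpy as np

class History:
    """Illustrative per-node embedding store (not the PyGAS API)."""

    def __init__(self, num_nodes, dim):
        # One stored embedding row per node, initialized to zeros.
        self.emb = np.zeros((num_nodes, dim), dtype=np.float32)

    def push(self, x, node_idx):
        # Save freshly computed embeddings for the in-batch nodes.
        self.emb[node_idx] = x

    def pull(self, node_idx):
        # Read (possibly stale) embeddings for out-of-batch neighbors.
        return self.emb[node_idx]

num_nodes, dim = 10, 4
hist = History(num_nodes, dim)

# One mini-batch step: nodes 0-4 are in-batch; nodes 5-9 are their
# out-of-batch neighbors, whose states come from the history buffer.
in_batch = np.arange(0, 5)
out_batch = np.arange(5, 10)

x_in = np.random.rand(5, dim).astype(np.float32)  # computed this step
hist.push(x_in, in_batch)

x_out = hist.pull(out_batch)            # stale historical states
x_full = np.concatenate([x_in, x_out])  # layer input for all 10 nodes
```

Because only in-batch nodes are recomputed each step, GPU memory grows with the batch size rather than with the neighborhood expansion of full message passing.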
Open Datasets: Yes. "In this section, we evaluate our GAS framework in practice using PyGAS, utilizing 6 different GNN operators and 15 datasets."
Dataset Splits: No. The paper mentions evaluating models on datasets and using mini-batch training, but it does not provide specific percentages or counts for training, validation, and test splits in the main text, nor does it cite predefined splits for all datasets used. It refers to "our code for hyperparameter configurations" for details, which might include splits.
Hardware Specification: Yes. "All models were trained on a single GeForce RTX 2080 Ti (11 GB)." "In our experiments, we hold all histories in RAM, using a machine with 64 GB of CPU memory."
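The 64 GB RAM figure makes sense given the footprint of the histories: one stored embedding per node per layer. A hedged back-of-envelope sketch (the node count, layer count, and hidden dimension below are illustrative assumptions, not values reported in the paper):

```python
def history_memory_gb(num_nodes, num_layers, hidden_dim, bytes_per_val=4):
    """Estimate CPU RAM needed to hold all historical embeddings (fp32)."""
    return num_nodes * num_layers * hidden_dim * bytes_per_val / 1024**3

# E.g. a graph with 2.4M nodes, histories for 3 layers, 256-dim embeddings:
gb = history_memory_gb(2_400_000, 3, 256)  # roughly 6.9 GB
```

Under these assumed settings the histories occupy only a few gigabytes, comfortably within a 64 GB machine, which is consistent with the paper's choice to keep them in CPU RAM rather than on the 11 GB GPU.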
Software Dependencies: No. The paper mentions using the PyTorch and PyTorch Geometric (PyG) libraries but does not provide specific version numbers for these software dependencies, which are required for full reproducibility.
Experiment Setup: No. The paper states "Please refer to the appendix for a detailed description of the used GNN operators and datasets, and to our code for hyperparameter configurations." and "For all experiments, we tried to follow the hyperparameter setup of the respective papers as closely as possible and perform an in-depth grid search on datasets for which best performing configurations are not known." However, it does not explicitly provide these specific hyperparameter values or training configurations within the main text.