Training Graph Neural Networks with 1000 Layers

Authors: Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our models RevGNN-Deep (1001 layers with 80 channels each) and RevGNN-Wide (448 layers with 224 channels each) were both trained on a single commodity GPU and achieve an ROC-AUC of 87.74 ± 0.13 and 88.24 ± 0.15 on the ogbn-proteins dataset. (A two-group reversible block in this spirit is sketched after the table.)
Researcher Affiliation | Collaboration | Intel Labs; King Abdullah University of Science and Technology.
Pseudocode | No | The paper describes methods in text and equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We release our implementation, which supports PyTorch Geometric (Fey & Lenssen, 2019) and the Deep Graph Library (Wang et al., 2019a).
Open Datasets | Yes | We conduct experiments on several datasets from the Open Graph Benchmark (OGB) (Hu et al., 2020). ... ogbn-proteins dataset from the Open Graph Benchmark (OGB) (Hu et al., 2020).
Dataset Splits | Yes | We use mini-batch training with random partitioning where graphs are split into 10 parts during training and 5 parts during testing (Li et al., 2020). The data splits and evaluation metrics on all datasets follow the OGB evaluation protocol. (This random-partition scheme is sketched after the table.)
Hardware Specification | Yes | Our models RevGNN-Deep... and RevGNN-Wide... were both trained on a single commodity GPU... RevGNN-Deep and RevGNN-Wide take 13.5 days and 17.1 days, respectively, to train for 2000 epochs on a single NVIDIA V100. We perform inference on an NVIDIA RTX A6000 (48 GB).
Software Dependencies | No | The implementation of all the reversible models is based on PyTorch (Paszke et al., 2019) and supports both the PyTorch Geometric (PyG) (Fey & Lenssen, 2019) and Deep Graph Library (DGL) (Wang et al., 2019a) frameworks. (Frameworks are named, but the quoted text gives no version information.)
Experiment Setup | Yes | We use the same GNN operator (Li et al., 2020), hyper-parameters (e.g. learning rate, dropout rate, training epoch, etc.), and optimizers to make the comparison as fair as possible. ... RevGNN-Wide uses a larger dropout rate of 0.2 to prevent overfitting. ... ϵ is set to 10⁻⁶·BD and 2·10⁻¹⁰·BD for the forward pass and the backward pass, respectively... The iteration thresholds in the forward pass and the backward pass are set to the same value. (A reading of the ϵ stopping tolerance is sketched after the table.)
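The constant-memory result quoted in the Research Type row rests on grouped reversible connections: a layer's inputs can be reconstructed exactly from its outputs, so activations need not be cached during training. Below is a minimal two-group sketch of that idea, assuming PyTorch Geometric's GCNConv as a stand-in for the paper's GNN operator; the paper generalizes the coupling to C groups and uses its own operator, and the class name here is illustrative, not the released implementation.

```python
import torch
from torch_geometric.nn import GCNConv  # stand-in for the paper's GNN operator


class ReversibleBlock(torch.nn.Module):
    """Two-group reversible coupling: y1 = x1 + F(x2), y2 = x2 + G(y1).

    The inputs are exactly recoverable from the outputs, so intermediate
    activations need not be stored during training. `channels` must be even.
    """

    def __init__(self, channels):
        super().__init__()
        self.F = GCNConv(channels // 2, channels // 2)
        self.G = GCNConv(channels // 2, channels // 2)

    def forward(self, x, edge_index):
        x1, x2 = torch.chunk(x, 2, dim=-1)
        y1 = x1 + self.F(x2, edge_index)
        y2 = x2 + self.G(y1, edge_index)
        return torch.cat([y1, y2], dim=-1)

    @torch.no_grad()
    def inverse(self, y, edge_index):
        # Exact reconstruction of the inputs from the outputs, running the
        # coupling in reverse order.
        y1, y2 = torch.chunk(y, 2, dim=-1)
        x2 = y2 - self.G(y1, edge_index)
        x1 = y1 - self.F(x2, edge_index)
        return torch.cat([x1, x2], dim=-1)
```

During the backward pass, a reversible training framework calls `inverse` to rebuild each layer's input from its output before computing gradients, which is what makes memory consumption independent of depth.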
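The Dataset Splits row quotes mini-batch training with random partitioning (10 parts during training, 5 during testing). Below is a minimal sketch of such a scheme, assuming PyTorch Geometric's subgraph utility; the function name and batching details are illustrative, not the paper's code.

```python
import torch
from torch_geometric.utils import subgraph


def random_partition(num_nodes, edge_index, num_parts):
    """Shuffle nodes, split them into num_parts groups, and yield the induced
    subgraph of each group as one mini-batch."""
    perm = torch.randperm(num_nodes)
    for nodes in perm.chunk(num_parts):
        # Keep only edges whose endpoints both fall in this partition,
        # relabeled to local indices 0..len(nodes)-1.
        sub_edges, _ = subgraph(nodes, edge_index,
                                relabel_nodes=True, num_nodes=num_nodes)
        yield nodes, sub_edges


# Usage sketch: features and labels are indexed with the global node ids.
# for nodes, sub_edges in random_partition(data.num_nodes, data.edge_index, 10):
#     out = model(data.x[nodes], sub_edges)
```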
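The ϵ values quoted in the Experiment Setup row are stopping tolerances for the fixed-point solver used by the weight-tied equilibrium baselines. Reading BD as the total number of hidden entries (batch size B times width D) is an assumption; the sketch below also uses naive iteration where the paper's baselines use Broyden's method, and all names are hypothetical.

```python
import torch


def solve_fixed_point(f, z0, eps_scale=1e-6, max_iter=50):
    """Iterate z <- f(z) until the update norm falls below a size-scaled
    tolerance, mirroring eps = 1e-6 * BD (forward) / 2e-10 * BD (backward)."""
    tol = eps_scale * z0.numel()  # assumed meaning of BD: total hidden entries
    z = z0
    for _ in range(max_iter):  # the quoted "iteration threshold"
        z_next = f(z)
        if torch.norm(z_next - z) < tol:  # converged within tolerance
            return z_next
        z = z_next
    return z
```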