GRAND: Graph Neural Diffusion
Authors: Ben Chamberlain, James Rowbottom, Maria I Gorinova, Michael Bronstein, Stefan Webb, Emanuele Rossi
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We develop linear and nonlinear versions of GRAND, which achieve competitive results on many standard graph benchmarks. We show detailed ablation studies shedding light on the choice of numerical schemes and parameters. Performance: Tables 1 and 2 summarise the results of our experiments. GRAND variants consistently perform among the best methods, achieving first place on all but one dataset, where they place second. |
| Researcher Affiliation | Collaboration | 1Twitter Inc., London, UK 2Imperial College London, UK 3IDSIA/USI, Switzerland. Correspondence to: Ben Chamberlain <bchamberlain@twitter.com>. |
| Pseudocode | No | The paper describes its methods using mathematical equations and textual descriptions but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and instructions to reproduce the experiments are available at https://github.com/twitterresearch/graph-neural-pde. |
| Open Datasets | Yes | Datasets We report results for the most widely used citation networks Cora (McCallum et al., 2000), Citeseer (Sen et al., 2008), Pubmed (Namata et al., 2012)... Additional datasets are the coauthor graph Coauthor CS (Shchur et al., 2018), the Amazon co-purchasing graphs Computers and Photo (McAuley et al., 2015), and the OGB arxiv dataset (Hu et al., 2020). |
| Dataset Splits | Yes | These datasets contain fixed splits that are often used, which we include for direct comparison in Table 1. To address the limitations of this evaluation methodology (Shchur et al., 2018), we also report results for all datasets using 100 random splits with 20 random initializations. Experimental setup: We follow the experimental methodology described in (Shchur et al., 2018), using 20 random weight initializations for datasets with fixed Planetoid splits and 100 random splits for the remaining datasets. |
| Hardware Specification | Yes | Experiments ran on AWS p2.8xlarge machines, each with 8 Tesla V100-SXM2 GPUs. |
| Software Dependencies | No | GRAND is implemented in PyTorch (Paszke et al., 2019), using PyTorch Geometric (Fey & Lenssen, 2019) and torchdiffeq (Chen et al., 2018). Specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | Hyperparameter search used Ray Tune (Liaw et al., 2018) with a thousand random trials, using an asynchronous hyperband scheduler with a grace period of ten epochs and a half-life of ten epochs. For smaller datasets (Cora, Citeseer) we used the ANODE augmentation scheme (Dupont et al., 2019) to stabilise training. The ogb-arxiv dataset used the Runge-Kutta method; for all others, Dormand-Prince was used. For the larger datasets, we used kinetic energy and Jacobian regularization (Finlay et al., 2020; Kelly et al., 2020). |
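The 100-random-splits protocol from Shchur et al. (2018) referenced above can be illustrated with a short sketch. This is a hypothetical NumPy reimplementation, not the authors' code: `random_split`, the class sizes, and the 20-train-nodes-per-class / validation-pool convention are illustrative assumptions about a Planetoid-style split.

```python
import numpy as np

def random_split(labels, n_train_per_class, n_val, seed):
    # One random Planetoid-style split: a fixed number of training
    # nodes per class, a random validation pool, the rest as test.
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in np.unique(labels):
        nodes = np.flatnonzero(labels == c)
        train_idx.extend(rng.choice(nodes, n_train_per_class, replace=False))
    train_idx = np.array(train_idx)
    rest = rng.permutation(np.setdiff1d(np.arange(len(labels)), train_idx))
    return train_idx, rest[:n_val], rest[n_val:]

# Toy 3-class, 150-node label vector; the report's protocol would
# repeat this for 100 seeds and average test accuracy across them.
labels = np.repeat([0, 1, 2], 50)
splits = [random_split(labels, n_train_per_class=20, n_val=30, seed=s)
          for s in range(100)]
```

Each split partitions the node set disjointly, so accuracy numbers across the 100 splits are comparable; the fixed-split rows in Table 1 would instead reuse one published partition.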
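The numerical schemes named above (Runge-Kutta, Dormand-Prince) integrate GRAND's diffusion ODE; its linear form can be sketched in a few lines. This is a toy NumPy illustration, assuming fixed uniform attention weights (row-normalized adjacency) and an explicit Euler step instead of the adaptive torchdiffeq solvers the paper actually uses; the graph and step size are made up for the example.

```python
import numpy as np

def diffusion_step(x, A_norm, dt):
    # Explicit Euler update for dx/dt = (A - I) x, the linear
    # graph diffusion equation at the core of GRAND.
    return x + dt * (A_norm @ x - x)

# Toy 3-node path graph; row normalization stands in for attention.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
A_norm = A / A.sum(axis=1, keepdims=True)

x = np.array([[1.0], [0.0], [0.0]])  # one-hot feature on node 0
for _ in range(100):
    x = diffusion_step(x, A_norm, dt=0.1)
# the feature diffuses until it is nearly constant over the graph
```

An adaptive solver such as Dormand-Prince replaces the fixed `dt` with error-controlled step sizes, which is why the paper's solver choice is a reported hyperparameter rather than an implementation detail.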