Decoupling the Depth and Scope of Graph Neural Networks

Authors: Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, on seven graphs (with up to 110M nodes) and six backbone GNN architectures, our design achieves significant accuracy improvement with orders of magnitude reduction in computation and hardware cost."
Researcher Affiliation | Collaboration | Hanqing Zeng (USC, zengh@usc.edu); Muhan Zhang (Peking University, BIGAI, muhan@pku.edu.cn); Yinglong Xia (Facebook AI, yxia@fb.com); Ajitesh Srivastava (USC, ajiteshs@usc.edu); Andrey Malevich (Facebook AI, amalevich@fb.com); Rajgopal Kannan (US ARL, rajgopal.kannan.civ@mail.mil); Viktor Prasanna (USC, prasanna@usc.edu); Long Jin (Facebook AI, longjin@fb.com); Ren Chen (Facebook AI, renchen@fb.com)
Pseudocode | Yes | See Appendix D and F.3 for the algorithm and the corresponding experiments.
Open Source Code | Yes | "Our code is available at https://github.com/facebookresearch/shaDow_GNN"
Open Datasets | Yes | "We evaluate shaDow-GNN on seven graphs. Six of them are for the node classification task: Flickr [55], Reddit [12], Yelp [55], ogbn-arxiv, ogbn-products and ogbn-papers100M [16]."
Dataset Splits | Yes | "We follow the default data splits for all datasets, which are usually 60% training, 20% validation, and 20% test. For ogbn-papers100M, the training, validation, test splits are 80%, 10%, 10% respectively." (from Appendix E.1; a loading sketch for the OGB datasets follows the table)
Hardware Specification | No | The paper's checklist answers "No" to: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?"
Software Dependencies | No | The paper mentions using W&B [5] but does not give version numbers for its software dependencies or libraries.
Experiment Setup | Yes | "All models on all datasets have uniform hidden dimension of 256. [...] For the model depth, since L = 3 is the standard setting in the literature (e.g., see the benchmarking in OGB [16]), we start from L = 3 and further evaluate a deeper model of L = 5. Hyperparameter tuning and architecture configurations are in Appendix E.4." Appendix E.4 further specifies a hidden dimension of 256 for all models, a learning rate of 0.001, the Adam optimizer [21] with weight decay of 5e-4, 1000 training epochs, and a dropout rate of 0.5. (a configuration sketch follows the table)
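The Open Datasets and Dataset Splits rows refer to OGB node-classification benchmarks evaluated with their default splits. Below is a minimal sketch (not the authors' pipeline) of how such a dataset and its predefined train/validation/test indices could be loaded through the public ogb package; the choice of ogbn-arxiv and the PygNodePropPredDataset API are assumptions beyond the quoted text.

# Sketch: load an OGB node-classification dataset and its default splits.
# Assumes the public `ogb` and `torch_geometric` packages are installed;
# illustrative only, not the shaDow-GNN authors' code.
from ogb.nodeproppred import PygNodePropPredDataset

dataset = PygNodePropPredDataset(name="ogbn-arxiv", root="dataset/")
graph = dataset[0]                    # a torch_geometric.data.Data object
split_idx = dataset.get_idx_split()   # default train/valid/test node indices

train_idx = split_idx["train"]
valid_idx = split_idx["valid"]
test_idx = split_idx["test"]

print(f"nodes: {graph.num_nodes}, "
      f"train/valid/test: {len(train_idx)}/{len(valid_idx)}/{len(test_idx)}")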
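The Experiment Setup row lists the hyperparameters reported in Appendix E.4 (hidden dimension 256, Adam with learning rate 0.001 and weight decay 5e-4, dropout 0.5, 1000 epochs, depth L = 3 or 5). The following hedged PyTorch sketch only mirrors that configuration; ShadowStyleGNN is a hypothetical placeholder backbone, not the released shaDow-GNN implementation.

# Sketch of the reported training configuration (Appendix E.4).
# `ShadowStyleGNN` is a hypothetical stub; only the hyperparameter values
# (hidden dim 256, lr 0.001, weight decay 5e-4, dropout 0.5, 1000 epochs,
# depth L = 3 or 5) come from the table above.
import torch
import torch.nn as nn

class ShadowStyleGNN(nn.Module):
    """Placeholder backbone with depth L, hidden dimension 256, dropout 0.5."""
    def __init__(self, in_dim, num_classes, hidden=256, depth=3, dropout=0.5):
        super().__init__()
        dims = [in_dim] + [hidden] * depth
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(depth)]
        )
        self.dropout = nn.Dropout(dropout)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # Real shaDow-GNN layers aggregate over per-node subgraphs; this stub
        # only illustrates the depth/width/dropout configuration.
        for layer in self.layers:
            x = self.dropout(torch.relu(layer(x)))
        return self.head(x)

model = ShadowStyleGNN(in_dim=128, num_classes=40, hidden=256, depth=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
criterion = nn.CrossEntropyLoss()
NUM_EPOCHS = 1000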