GraphSAINT: Graph Sampling Based Inductive Learning Method
Authors: Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | GraphSAINT demonstrates superior performance in both accuracy and training time on five large graphs, and achieves new state-of-the-art F1 scores for PPI (0.995) and Reddit (0.970). |
| Researcher Affiliation | Collaboration | Hanqing Zeng University of Southern California zengh@usc.edu Hongkuan Zhou University of Southern California hongkuaz@usc.edu Ajitesh Srivastava University of Southern California ajiteshs@usc.edu Rajgopal Kannan US Army Research Lab rajgopal.kannan.civ@mail.mil Viktor Prasanna University of Southern California prasanna@usc.edu |
| Pseudocode | Yes | Algorithm 1: GraphSAINT training algorithm (see the training-loop sketch after this table). |
| Open Source Code | Yes | We open source GraphSAINT. Open-sourced code: https://github.com/GraphSAINT/GraphSAINT |
| Open Datasets | Yes | The Flickr dataset originates from NUS-wide. The SNAP website collected Flickr data from four different sources including NUS-wide, and generated an undirected graph. (...) The Yelp dataset is prepared from the raw json data of businesses, users and reviews provided in the open challenge website. (...) For the Amazon dataset, a node is a product on the Amazon website and an edge (u, v) is created if products u and v are bought by the same customer. |
| Dataset Splits | Yes | All datasets follow fixed-partition splits; Appendix C.2 includes further details (a split-loading sketch follows this table). |
| Hardware Specification | Yes | We run our experiments on a single machine with Dual Intel Xeon CPUs (E5-2698 v4 @ 2.2GHz), one NVIDIA Tesla P100 GPU (16GB of HBM2 memory) and 512GB DDR4 memory. |
| Software Dependencies | Yes | The code is written in Python 3.6.8 (where the sampling part is written with Cython 0.29.2). We use TensorFlow 1.12.0 on CUDA 9.2 with cuDNN 7.2.1 to train the model on GPU. |
| Experiment Setup | Yes | For all baselines and datasets, we perform grid search on the hyperparameter space defined by: Hidden dimension: {128, 256, 512} Dropout: {0.0, 0.1, 0.2, 0.3} Learning rate: {0.1, 0.01, 0.001, 0.0001}. The hidden dimensions used for Table 2, Figure 2, Figure 3, and Figure 4 are: 512 for PPI, 256 for Flickr, 128 for Reddit, 512 for Yelp, and 512 for Amazon. (A grid-search sketch follows this table.) |
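The training loop in Algorithm 1 decouples sampling from GCN construction: normalization coefficients are estimated once from pre-sampled subgraphs, and each training iteration then builds a complete GCN on a small sampled subgraph, with aggregation normalized by alpha and the loss by lambda to keep the minibatch estimators unbiased. Below is a minimal runnable sketch of that structure, assuming a toy synthetic graph, a uniform node sampler, and a one-layer linear GCN; `sample_subgraph` and all toy shapes are illustrative stand-ins, not names from the released code (the paper's stronger edge and random-walk samplers would replace the sampler here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a small synthetic graph and a one-layer GCN classifier.
N, F, C = 100, 16, 4                          # nodes, feature dim, classes
A = (rng.random((N, N)) < 0.05).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)                      # dense adjacency with self-loops
X = rng.normal(size=(N, F))                   # node features
y = rng.integers(0, C, size=N)                # node labels
W = rng.normal(scale=0.1, size=(F, C))        # GCN weight matrix

def sample_subgraph(budget=25):
    # Uniform node sampler inducing a subgraph (illustrative stand-in).
    return np.sort(rng.choice(N, size=budget, replace=False))

# Pre-processing phase: sample subgraphs to estimate the aggregator (alpha)
# and loss (lambda) normalization coefficients from appearance counts.
n_pre = 200
node_cnt = np.zeros(N)
edge_cnt = np.zeros((N, N))
for _ in range(n_pre):
    v = sample_subgraph()
    node_cnt[v] += 1
    edge_cnt[np.ix_(v, v)] += A[np.ix_(v, v)]
lam = np.maximum(node_cnt / n_pre, 1e-6)               # ~P(v in G_s)
alpha = edge_cnt / np.maximum(node_cnt[None, :], 1.0)  # ~C_{u,v} / C_v

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Training phase: each iteration samples G_s and runs a full GCN on it.
lr = 0.1
for step in range(300):
    v = sample_subgraph()
    A_s = A[np.ix_(v, v)] * alpha[np.ix_(v, v)]  # alpha-normalized aggregation
    P = softmax(A_s @ X[v] @ W)                  # 1-layer GCN on G_s only
    w = 1.0 / lam[v]                             # lambda-normalized loss
    grad_logits = (P - np.eye(C)[y[v]]) * w[:, None] / len(v)
    W -= lr * (A_s @ X[v]).T @ grad_logits       # SGD update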
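For the fixed-partition splits noted in the Dataset Splits row, the released repository ships each dataset with split files. The sketch below assumes the repo's data layout (`adj_full.npz`, `feats.npy`, and a `role.json` holding `tr`/`va`/`te` node-id lists); verify the exact file names and keys against the repository before relying on them.

```python
import json
import numpy as np
import scipy.sparse as sp

def load_split(data_dir):
    """Load one dataset's fixed train/val/test partition, assuming the
    GraphSAINT-repo layout (adjust paths/keys if the layout differs)."""
    adj_full = sp.load_npz(f"{data_dir}/adj_full.npz")  # full-graph adjacency
    feats = np.load(f"{data_dir}/feats.npy")            # node feature matrix
    with open(f"{data_dir}/role.json") as f:
        role = json.load(f)                             # fixed partition
    train_ids = np.array(role["tr"])
    val_ids = np.array(role["va"])
    test_ids = np.array(role["te"])
    return adj_full, feats, train_ids, val_ids, test_ids
```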
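The grid search in the Experiment Setup row spans 3 hidden dimensions x 4 dropout rates x 4 learning rates = 48 configurations per dataset and baseline, and can be scripted directly. A minimal sketch, where `train_and_eval` is a hypothetical placeholder for one full training run returning validation micro-F1:

```python
import random
from itertools import product

hidden_dims = [128, 256, 512]
dropouts = [0.0, 0.1, 0.2, 0.3]
learning_rates = [0.1, 0.01, 0.001, 0.0001]

def train_and_eval(hidden, dropout, lr):
    # Hypothetical placeholder: substitute one real training run that
    # returns the validation micro-F1 for this configuration.
    return random.random()

best_f1, best_cfg = -1.0, None
for hidden, dropout, lr in product(hidden_dims, dropouts, learning_rates):
    f1 = train_and_eval(hidden, dropout, lr)
    if f1 > best_f1:
        best_f1, best_cfg = f1, {"hidden": hidden, "dropout": dropout, "lr": lr}
print(f"best val F1 {best_f1:.3f} with {best_cfg}")
```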