Blocking-based Neighbor Sampling for Large-scale Graph Neural Networks
Authors: Kai-Lang Yao, Wu-Jun Li
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three benchmark datasets show that, on large-scale graphs, BNS is 2×∼5× faster than state-of-the-art methods when achieving the same accuracy. Moreover, even on the small-scale graphs, BNS also demonstrates the advantage of low time cost. (Section 4, Experiments) In this section, we compare BNS with other baselines on five node-classification datasets. |
| Researcher Affiliation | Academia | Kai-Lang Yao and Wu-Jun Li, National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Department of Computer Science and Technology, Nanjing University, China; yaokl@lamda.nju.edu.cn, liwujun@nju.edu.cn |
| Pseudocode | Yes | Algorithm 1 Sampling Algorithm (the algorithm itself is not excerpted here; an illustrative sketch follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology or a link to a code repository. The only link provided is for an Appendix containing a proof: 'The Appendix can be found in https://cs.nju.edu.cn/lwj/'. |
| Open Datasets | Yes | Ogbn-products, ogbn-papers100M and ogbn-proteins are publicly available [Hu et al., 2020]. |
| Dataset Splits | No | The paper does not explicitly provide specific percentages, counts, or detailed methodologies for training, validation, and test splits. It refers to general dataset usage and evaluation criteria but lacks the concrete split information for reproducibility. (The OGB datasets it uses do ship with standard split indices, which can be retrieved programmatically; see the loading sketch after this table.) |
| Hardware Specification | Yes | All experiments are run on an NVIDIA Titan XP GPU server with 12 GB graphics memory. |
| Software Dependencies | No | The paper mentions 'Pytorch platform', 'Pytorch-Geometric Library', and 'Adam' optimizer. However, it does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Empirically, r is set to 128 on all datasets; L is set to 5 on both ogbn-proteins and ogbn-products, and to 3 on ogbn-papers100M. T is set to 100 on both ogbn-products and ogbn-papers100M, and to 1,000 on ogbn-proteins. The values of λ and p are obtained by tuning with NS on the benchmark datasets: on ogbn-products, λ = 5×10⁻⁶ and p = 0.1; on ogbn-papers100M, λ = 5×10⁻⁷ and p = 0.1; on ogbn-proteins, λ = 0 and p = 0. In BNS, ρ is set to 0.5 for convenience and not tuned. Adam [Kingma and Ba, 2015] is used to optimize the model, and the learning rate η is set to 0.01. (These settings are collected into a configuration sketch after this table.) |
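For the Pseudocode row above: the paper's Algorithm 1 is not reproduced in this report, so the following is only a minimal illustrative sketch of what a blocking-based neighbor sampler could look like, assuming the high-level idea from the paper that a fraction ρ of sampled nodes is "blocked" from further recursive expansion at each layer. All function and variable names here are hypothetical and not taken from the authors' code.

```python
import random

def bns_sample(adj, batch_nodes, num_layers, fanout, rho=0.5):
    """Illustrative sketch of blocking-based neighbor sampling (BNS).

    adj: dict mapping node id -> list of neighbor ids (hypothetical input format).
    batch_nodes: seed nodes of the mini-batch.
    num_layers: number of GNN layers (L in the paper).
    fanout: number of neighbors sampled per node at each layer.
    rho: fraction of sampled nodes whose further expansion is blocked
         (the paper fixes rho = 0.5 without tuning).
    Returns, per layer, the sampled nodes split into blocked and expanded sets.
    """
    layers = []
    frontier = list(batch_nodes)
    for _ in range(num_layers):
        sampled = set()
        for v in frontier:
            neigh = adj.get(v, [])
            k = min(fanout, len(neigh))
            sampled.update(random.sample(neigh, k))
        sampled = list(sampled)
        random.shuffle(sampled)
        cut = int(rho * len(sampled))
        blocked, expanded = sampled[:cut], sampled[cut:]
        layers.append({"blocked": blocked, "expanded": expanded})
        frontier = expanded  # only non-blocked nodes are expanded at the next layer
    return layers
```

Blocking the recursion for a fraction of sampled nodes is what keeps the sampled computation graph from growing exponentially with depth, which is the source of the reported speedups.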
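For the Open Datasets and Dataset Splits rows: although the paper does not spell out split percentages, the OGB benchmarks it uses come with standard train/valid/test node splits that a reproduction would typically retrieve through the `ogb` package, as sketched below for ogbn-products (the same calls work for the other OGB node-classification datasets).

```python
from ogb.nodeproppred import PygNodePropPredDataset

# Load ogbn-products through the OGB package (PyTorch Geometric format).
dataset = PygNodePropPredDataset(name="ogbn-products")
graph = dataset[0]                    # a torch_geometric.data.Data object

# Standard OGB train/valid/test node indices shipped with the dataset.
split_idx = dataset.get_idx_split()
train_idx = split_idx["train"]
valid_idx = split_idx["valid"]
test_idx = split_idx["test"]
```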
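Finally, the hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The variable names below (`lam` for λ, `rho` for ρ) are chosen here for illustration and are not taken from the authors' implementation; only the numeric values come from the paper.

```python
import torch

# Per-dataset hyperparameters as reported in the paper's experiment setup.
CONFIGS = {
    "ogbn-products":   dict(r=128, L=5, T=100,  lam=5e-6, p=0.1, rho=0.5),
    "ogbn-papers100M": dict(r=128, L=3, T=100,  lam=5e-7, p=0.1, rho=0.5),
    "ogbn-proteins":   dict(r=128, L=5, T=1000, lam=0.0,  p=0.0, rho=0.5),
}

def make_optimizer(model):
    # Adam with learning rate eta = 0.01, as stated in the paper.
    return torch.optim.Adam(model.parameters(), lr=0.01)
```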