Blocking-based Neighbor Sampling for Large-scale Graph Neural Networks

Authors: Kai-Lang Yao, Wu-Jun Li

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on three benchmark datasets show that, on large-scale graphs, BNS is 2x-5x faster than state-of-the-art methods when achieving the same accuracy. Moreover, even on the small-scale graphs, BNS also demonstrates the advantage of low time cost." (Abstract) "In this section, we compare BNS with other baselines on five node-classification datasets." (Section 4, Experiments)
Researcher Affiliation | Academia | Kai-Lang Yao and Wu-Jun Li, National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Department of Computer Science and Technology, Nanjing University, China; yaokl@lamda.nju.edu.cn, liwujun@nju.edu.cn
Pseudocode | Yes | Algorithm 1 (Sampling Algorithm). A hedged sketch of the sampling idea is given after the table.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor a link to a code repository. The only link provided is for an appendix containing a proof: 'The Appendix can be found in https://cs.nju.edu.cn/lwj/'.
Open Datasets | Yes | Ogbn-products, ogbn-papers100M and ogbn-proteins are publicly available [Hu et al., 2020].
Dataset Splits | No | The paper does not explicitly provide percentages, counts, or a detailed methodology for the training, validation, and test splits. It refers to general dataset usage and evaluation criteria but lacks concrete split information for reproducibility. Since all three benchmarks are OGB datasets, the standard OGB splits likely apply; see the loading sketch after the table.
Hardware Specification | Yes | All experiments are run on an NVIDIA Titan XP GPU server with 12 GB graphics memory.
Software Dependencies | No | The paper mentions the 'Pytorch platform', the 'Pytorch-Geometric Library', and the 'Adam' optimizer, but it does not specify version numbers for any of these software dependencies.
Experiment Setup | Yes | Empirically, r is set to 128 on all datasets; L is set to 5 on both ogbn-proteins and ogbn-products, and to 3 on ogbn-papers100M. T is set to 100 on both ogbn-products and ogbn-papers100M, and to 1,000 on ogbn-proteins. The values of λ and p are obtained by tuning with NS on the benchmark datasets: on ogbn-products, λ = 5×10^-6 and p = 0.1; on ogbn-papers100M, λ = 5×10^-7 and p = 0.1; on ogbn-proteins, λ = 0 and p = 0. In BNS, ρ is set to 0.5 for convenience and not tuned. Adam [Kingma and Ba, 2015] is used to optimize the model, with the learning rate η set to 0.01. These settings are wired into a hedged configuration sketch after the table.
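
To give readers a concrete handle on the Pseudocode row above, here is a minimal, hypothetical Python sketch of the blocking idea as we read it: at each layer, up to s neighbors are sampled per still-active node, and a fraction ρ of the freshly sampled nodes is "blocked", i.e., kept in the computation graph but never expanded at deeper layers, which curbs exponential neighborhood growth. The function name bns_sample, the adjacency-dict input format, and all variable names are our own inventions; this is a sketch of the idea, not the authors' Algorithm 1.

```python
import random

def bns_sample(adj, batch_nodes, s, rho, num_layers):
    # Hypothetical sketch of blocking-based neighbor sampling (NOT the
    # paper's exact Algorithm 1). `adj` maps each node id to its
    # neighbor list; `s` is the per-node sample size; `rho` is the
    # blocking ratio (the paper reports rho = 0.5).
    layers = [list(batch_nodes)]
    active = set(batch_nodes)  # nodes whose neighborhoods we still expand
    for _ in range(num_layers):
        sampled = set()
        for u in active:
            nbrs = adj[u]
            sampled.update(random.sample(nbrs, min(s, len(nbrs))))
        sampled = list(sampled)
        random.shuffle(sampled)
        cut = int(rho * len(sampled))
        blocked, expandable = sampled[:cut], sampled[cut:]
        layers.append(blocked + expandable)  # all sampled nodes join this layer
        active = set(expandable)             # blocked nodes are never expanded again
    return layers

# Tiny demo on a 4-node toy graph.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(bns_sample(adj, batch_nodes=[0], s=2, rho=0.5, num_layers=2))
```
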
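On the Dataset Splits row: because the three benchmarks come from the Open Graph Benchmark, the standard splits can be recovered programmatically even though the paper does not restate them. The sketch below assumes the paper used the default OGB splits, which the text does not explicitly confirm.

```python
from ogb.nodeproppred import PygNodePropPredDataset

# Load one of the benchmarks named in the paper; OGB ships a standard
# train/valid/test split with each dataset.
dataset = PygNodePropPredDataset(name="ogbn-products")
graph = dataset[0]                   # a PyTorch Geometric Data object
split_idx = dataset.get_idx_split()  # {'train': ..., 'valid': ..., 'test': ...}
print({k: v.numel() for k, v in split_idx.items()})
```
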
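Finally, a sketch wiring the reported ogbn-products hyperparameters into a toy PyTorch setup, under our unconfirmed reading that r is the hidden dimension, L the number of layers, T the training-epoch budget, λ the weight-decay coefficient, and p the dropout rate. The nn.Sequential stand-in model is purely illustrative; the paper trains a GNN with BNS, not this MLP.

```python
import torch
import torch.nn as nn

# ogbn-products has 100-dimensional node features and 47 classes.
IN_DIM, HIDDEN, NUM_CLASSES = 100, 128, 47   # r = 128 (our reading: hidden dim)
T = 100                                      # our reading: number of training epochs

# Illustrative stand-in for the paper's model; only the hyperparameters
# below are taken from the paper's ogbn-products setting.
model = nn.Sequential(
    nn.Linear(IN_DIM, HIDDEN),
    nn.ReLU(),
    nn.Dropout(p=0.1),                       # p = 0.1
    nn.Linear(HIDDEN, NUM_CLASSES),
)
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.01,                                 # eta = 0.01
    weight_decay=5e-6,                       # lambda = 5e-6 (our reading: weight decay)
)
```
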