Bandit Samplers for Training Graph Neural Networks

Authors: Ziqi Liu, Zhengwei Wu, Zhiqiang Zhang, Jun Zhou, Shuang Yang, Le Song, Yuan Qi

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We theoretically show that our algorithm asymptotically approaches the optimal variance within a factor of 3. We show the efficiency and effectiveness of our approach on multiple datasets. We empirically show that our approaches are highly competitive in terms of convergence and sample variance, compared with state-of-the-art approaches on multiple public datasets.
Researcher Affiliation | Collaboration | Ziqi Liu (Ant Group, ziqiliu@antfin.com); Zhengwei Wu (Ant Group, zejun.wzw@antfin.com); Zhiqiang Zhang (Ant Group, lingyao.zzq@antfin.com); Jun Zhou (Ant Group, jun.zhoujun@antfin.com); Shuang Yang (Alibaba Group, shuang.yang@antfin.com); Le Song (Ant Group and Georgia Institute of Technology, lsong@cc.gatech.edu); Yuan Qi (Ant Group, yuan.qi@antfin.com)
Pseudocode | Yes | Algorithm 1: Bandit Samplers for Training GNNs. (An illustrative sketch follows the table.)
Open Source Code | Yes | Please find our implementations at https://github.com/xavierzw/gnn-bs.
Open Datasets | Yes | We report results on 5 benchmark datasets: Cora [18], Pubmed [18], PPI [11], Reddit [11], and Flickr [22].
Dataset Splits | Yes | We follow the standard data splits and summarize the statistics in Table 1. Following the existing implementations, we save the model based on the best validation results and restore it to report results on the test data in Section 7.1. (A checkpoint-selection sketch follows the table.)
Hardware Specification | Yes | We run all the experiments using one machine with an Intel Xeon E5-2682 CPU and 512 GB RAM.
Software Dependencies | No | No specific software versions (e.g., Python 3.8, PyTorch 1.9) were mentioned for reproducibility.
Experiment Setup | Yes | We fix the number of layers as 2, as in [13], for all comparison algorithms. We set the dimension of hidden embeddings as 16 for Cora and Pubmed, and 256 for PPI, Reddit and Flickr. For a fair comparison, we do not use the normalization layer [2] particularly used in some works [5, 22]. For attentive GNNs, we use the attention layer proposed in GAT and set the number of multi-heads as 1 for simplicity. We do a grid search over the following hyperparameters for each algorithm: the learning rate {0.01, 0.001}, the penalty weight on the ℓ2-norm regularizers {0, 0.0001, 0.0005, 0.001}, and the dropout rate {0, 0.1, 0.2, 0.3}. For the sample size in GraphSAGE, S-GCN and our algorithms, we set 1 for Cora and Pubmed, 5 for Flickr, and 10 for PPI and Reddit. (The grid is sketched in code below.)
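
The Pseudocode row refers to the paper's Algorithm 1. Below is a minimal, hypothetical sketch of the general idea only, assuming one bandit per target node whose arms are its neighbors and an EXP3-style exponential-weights update; the class name, reward definition, and update rule are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch only (not the authors' Algorithm 1): a per-node bandit
# whose arms are the node's neighbors, sampled in proportion to exponential
# weights and updated with importance-weighted rewards (EXP3-style).
import numpy as np

class BanditNeighborSampler:
    def __init__(self, neighbors, eta=0.1):
        self.neighbors = neighbors                      # dict: node -> list of neighbor ids
        self.eta = eta                                  # step size of the weight update
        self.weights = {v: np.ones(len(n)) for v, n in neighbors.items()}

    def sample(self, v, k):
        """Draw k distinct neighbors of v with probability proportional to their weights."""
        w = self.weights[v]
        probs = w / w.sum()
        idx = np.random.choice(len(w), size=min(k, len(w)), replace=False, p=probs)
        return idx, probs[idx]

    def update(self, v, idx, rewards, probs):
        """Exponential-weights update; rewards would come from a variance-reduction signal."""
        for i, r, p in zip(idx, rewards, probs):
            self.weights[v][i] *= np.exp(self.eta * r / max(p, 1e-12))

# usage: sample 2 of node 0's neighbors, then feed back rewards for the drawn arms
sampler = BanditNeighborSampler({0: [1, 2, 3, 4]})
idx, probs = sampler.sample(0, k=2)
sampler.update(0, idx, rewards=[0.5, 0.2], probs=probs)
```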
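For the Dataset Splits row, the save-best-on-validation / restore-for-test protocol can be summarized with a small framework-agnostic sketch; `evaluate` and the in-memory snapshot below are placeholders, not the repository's actual API.

```python
# Hypothetical sketch of the model-selection protocol described above:
# keep the snapshot with the best validation score, restore it for the test metric.
import random

def evaluate(split):                      # placeholder metric
    return random.random()

best_val, best_snapshot = float("-inf"), None
for epoch in range(10):
    # ... one training epoch would run here ...
    val_score = evaluate("val")
    if val_score > best_val:              # save only the best-validation model
        best_val, best_snapshot = val_score, {"epoch": epoch}

# restore best_snapshot before reporting the test result
test_score = evaluate("test")
print(best_snapshot, test_score)
```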
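The hyperparameter grid from the Experiment Setup row can be written out explicitly; only the value sets come from the paper, while the key names and enumeration below are illustrative assumptions.

```python
# The reported search space, enumerated with itertools.product.
# Key names (learning_rate, weight_decay, dropout) are illustrative choices.
from itertools import product

param_grid = {
    "learning_rate": [0.01, 0.001],
    "weight_decay":  [0, 0.0001, 0.0005, 0.001],   # penalty on the l2-norm regularizers
    "dropout":       [0, 0.1, 0.2, 0.3],
}

# per-dataset neighbor sample size for GraphSAGE, S-GCN, and the bandit samplers
sample_size = {"Cora": 1, "Pubmed": 1, "Flickr": 5, "PPI": 10, "Reddit": 10}

configs = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
print(len(configs))   # 2 * 4 * 4 = 32 configurations per algorithm and dataset
```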