FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling

Authors: Jie Chen, Tengfei Ma, Cao Xiao

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models. In particular, training is orders of magnitude more efficient while predictions remain comparably accurate." and "4 EXPERIMENTS We follow the experiment setup in Kipf & Welling (2016a) and Hamilton et al. (2017) to demonstrate the effective use of FastGCN, comparing with the original GCN model as well as GraphSAGE, on the following benchmark tasks: (1) classifying research topics using the Cora citation data set (McCallum et al., 2000); (2) categorizing academic papers with the Pubmed database; and (3) predicting the community structure of a social network modeled with Reddit posts."
Researcher Affiliation | Industry | "Jie Chen, Tengfei Ma, Cao Xiao. IBM Research. chenjie@us.ibm.com, Tengfei.Ma1@ibm.com, cxiao@us.ibm.com"
Pseudocode | Yes | "Algorithm 1 FastGCN batched training (one epoch)" and "Algorithm 2 FastGCN batched training (one epoch), improved version"
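The batched training in Algorithms 1 and 2 rests on FastGCN's core idea: treat each graph convolution as an integral over nodes and estimate it by importance sampling, so a batch never touches the full adjacency matrix. A minimal NumPy sketch of one such sampled layer, assuming the paper's sampling distribution q(u) proportional to the squared column norm of A (the function name, shapes, and dense-matrix representation are illustrative, not the authors' implementation):

```python
import numpy as np

def sampled_conv_layer(A, H, W, t, rng):
    """Monte Carlo estimate of the graph convolution A @ H @ W.

    Samples t nodes (columns of A) with probability q(u) proportional to
    ||A[:, u]||^2, then importance-weights their contributions. The
    resulting estimate of A @ H is unbiased; the nonlinearity is applied
    by the caller.
    """
    n = A.shape[1]
    q = np.linalg.norm(A, axis=0) ** 2
    q = q / q.sum()
    idx = rng.choice(n, size=t, replace=True, p=q)
    # Importance-weighted sum over the t sampled nodes.
    AH_est = (A[:, idx] / (t * q[idx])) @ H[idx]
    return AH_est @ W
```

Because each sampled term is divided by t * q(u), averaging the estimator over many draws recovers the full product A @ H @ W, while a single forward pass costs only O(t) node features per layer.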
Open Source Code | Yes | "For more details please check our codes in a temporary git repository https://github.com/matenure/FastGCN."
Open Datasets | Yes | "(1) classifying research topics using the Cora citation data set (McCallum et al., 2000); (2) categorizing academic papers with the Pubmed database; and (3) predicting the community structure of a social network modeled with Reddit posts. These data sets are downloaded from the accompanying websites of the aforementioned references." and "The Cora and Pubmed data sets are from https://github.com/tkipf/gcn." and "The Reddit data is from http://snap.stanford.edu/graphsage/."
Dataset Splits | Yes | "Table 1: Dataset Statistics" reports, per data set (nodes / edges / classes / features / train/validation/test split): Cora: 2,708 / 5,429 / 7 / 1,433 / 1,208/500/1,000; Pubmed: 19,717 / 44,338 / 3 / 500 / 18,217/500/1,000; Reddit: 232,965 / 11,606,919 / 41 / 602 / 152,410/23,699/55,334. Additionally: "We adjusted the training/validation/test split of Cora and Pubmed to align with the supervised learning scenario."
Hardware Specification | Yes | "Running time is compared on a single machine with 4-core 2.5 GHz Intel Core i7, and 16G RAM."
Software Dependencies | No | The paper mentions software such as PyTorch and TensorFlow but does not provide specific version numbers for these or other libraries/packages.
Experiment Setup | Yes | "Implementation details are as follows. All networks (including those under comparison) contain two layers as usual. [...] We performed hyperparameter selection for the learning rate and model dimension. We swept learning rate in the set {0.01, 0.001, 0.0001}. The hidden dimension of FastGCN for Reddit is set as 128, and for the other two data sets, it is 16. The batch size is 256 for Cora and Reddit, and 1024 for Pubmed. Dropout rate is set as 0. We use Adam as the optimization method for training."
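The quoted setup pins down most training knobs explicitly. As a hedged illustration of how that sweep could be organized, the sketch below encodes only the numeric values stated in the excerpt; the dictionary layout and the `configs` helper are assumptions for clarity, not the authors' code:

```python
# Swept values and per-dataset settings as quoted from the paper's setup.
LEARNING_RATES = [0.01, 0.001, 0.0001]
SETUP = {
    "cora":   {"hidden": 16,  "batch_size": 256,  "layers": 2, "dropout": 0.0},
    "pubmed": {"hidden": 16,  "batch_size": 1024, "layers": 2, "dropout": 0.0},
    "reddit": {"hidden": 128, "batch_size": 256,  "layers": 2, "dropout": 0.0},
}

def configs(dataset):
    """Yield one Adam training configuration per candidate learning rate."""
    base = SETUP[dataset]
    for lr in LEARNING_RATES:
        yield {**base, "dataset": dataset, "lr": lr, "optimizer": "adam"}
```

Iterating `configs("cora")` produces the three candidate configurations to be compared on the validation split, matching the stated sweep over learning rates with all other hyperparameters fixed.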