Stochastic Training of Graph Convolutional Networks with Variance Reduction
Authors: Jianfei Chen, Jun Zhu, Le Song
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our algorithms enjoy similar convergence rate and model quality with the exact algorithm using only two neighbors per node. The running time of our algorithms on a large Reddit dataset is only one seventh of previous neighbor sampling algorithms. We empirically test our algorithms on six graph datasets, and the results match with the theory. The experiments are done on a Titan X (Maxwell) GPU. |
| Researcher Affiliation | Collaboration | Jianfei Chen 1 Jun Zhu 1 Le Song 2 3 1Dept. of Comp. Sci. & Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., THBI Lab, Tsinghua University, Beijing, 100084, China 2Georgia Institute of Technology 3Ant Financial |
| Pseudocode | Yes | The pseudocode for the training procedure is given in Appendix D. |
| Open Source Code | Yes | Our code is released at https://github.com/thu-ml/stochastic_gcn. |
| Open Datasets | Yes | We examine the variance and convergence of our algorithms empirically on six datasets, including Citeseer, Cora, PubMed and NELL from Kipf & Welling (2017) and Reddit, PPI from Hamilton et al. (2017a), with the same train / validation / test splits, as summarized in Table 1. |
| Dataset Splits | Yes | We examine the variance and convergence of our algorithms empirically on six datasets, including Citeseer, Cora, PubMed and NELL from Kipf & Welling (2017) and Reddit, PPI from Hamilton et al. (2017a), with the same train / validation / test splits, as summarized in Table 1. |
| Hardware Specification | Yes | The experiments are done on a Titan X (Maxwell) GPU. |
| Software Dependencies | No | The paper mentions 'frameworks such as TensorFlow (Abadi et al., 2016)' but does not specify a version number for TensorFlow or any other software dependency. |
| Experiment Setup | Yes | We set the dropout rate as zero and plot the training loss with respect to number of epochs as Fig. 2. All the four algorithms have similar low time complexity per epoch with D^(l) = 2, while M1+PP takes D^(l) = 20. |
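
The Experiment Setup row refers to neighbor sampling with a per-layer receptive-field size of D^(l) = 2. As a hedged illustration only (a minimal sketch, not the authors' released implementation at the repository linked above), the following NumPy snippet shows what aggregating over two uniformly sampled neighbors per node in a single GCN layer could look like; the function name `sampled_propagation` and the toy graph are hypothetical.

```python
# Hypothetical sketch of neighbor-sampled GCN layer propagation with a
# receptive-field size of d_l = 2 neighbors per node. Not the paper's code.
import numpy as np

def sampled_propagation(adj_lists, H, W, d_l=2, seed=None):
    """One GCN layer: for each node, average the features of at most d_l
    uniformly sampled neighbors (a Monte Carlo estimate of the full
    neighbor aggregation), then apply a linear transform and ReLU."""
    rng = np.random.default_rng(seed)
    N = H.shape[0]
    Z = np.zeros_like(H)
    for v in range(N):
        nbrs = adj_lists[v]
        sampled = rng.choice(nbrs, size=min(d_l, len(nbrs)), replace=False)
        Z[v] = H[sampled].mean(axis=0)   # estimate of the mean over all neighbors
    return np.maximum(Z @ W, 0.0)        # linear transform + ReLU

# Toy usage: 4-node graph, adjacency lists include self-loops
adj_lists = [[0, 1, 2], [0, 1], [0, 2, 3], [2, 3]]
H = np.random.randn(4, 8).astype(np.float32)   # input node features
W = np.random.randn(8, 16).astype(np.float32)  # layer weights
Z = sampled_propagation(adj_lists, H, W, d_l=2, seed=0)
print(Z.shape)  # (4, 16)
```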