Stochastic Training of Graph Convolutional Networks with Variance Reduction

Authors: Jianfei Chen, Jun Zhu, Le Song

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that our algorithms enjoy similar convergence rate and model quality with the exact algorithm using only two neighbors per node. The running time of our algorithms on a large Reddit dataset is only one seventh of previous neighbor sampling algorithms. We empirically test our algorithms on six graph datasets, and the results match with the theory. The experiments are done on a Titan X (Maxwell) GPU.
Researcher Affiliation | Collaboration | Jianfei Chen (1), Jun Zhu (1), Le Song (2,3); (1) Dept. of Comp. Sci. & Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., THBI Lab, Tsinghua University, Beijing, 100084, China; (2) Georgia Institute of Technology; (3) Ant Financial
Pseudocode | Yes | We have the pseudocode for the training in Appendix D.
Open Source Code | Yes | Our code is released at https://github.com/thu-ml/stochastic_gcn.
Open Datasets | Yes | We examine the variance and convergence of our algorithms empirically on six datasets, including Citeseer, Cora, PubMed and NELL from Kipf & Welling (2017) and Reddit, PPI from Hamilton et al. (2017a), with the same train / validation / test splits, as summarized in Table 1.
Dataset Splits | Yes | We examine the variance and convergence of our algorithms empirically on six datasets, including Citeseer, Cora, PubMed and NELL from Kipf & Welling (2017) and Reddit, PPI from Hamilton et al. (2017a), with the same train / validation / test splits, as summarized in Table 1. (A loader sketch for these public splits appears after the table.)
Hardware Specification | Yes | The experiments are done on a Titan X (Maxwell) GPU.
Software Dependencies | No | The paper mentions 'frameworks such as TensorFlow (Abadi et al., 2016)' but does not specify a version number for TensorFlow or any other software dependency.
Experiment Setup | Yes | We set the dropout rate as zero and plot the training loss with respect to number of epochs as Fig. 2. All the four algorithms have similar low time complexity per epoch with D(l) = 2, while M1+PP takes D(l) = 20. (A minimal sketch of this D(l)-neighbor estimator follows the table.)
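
The D(l) = 2 setting quoted above refers to the paper's control-variate (CV) neighbor-sampling estimator, which keeps stale historical activations for every node and samples only a few neighbors for the fresh residual term. The sketch below is a minimal NumPy illustration of that estimator, not the authors' released TensorFlow implementation (see the repository linked in the table); the dense propagation matrix and the function name cv_aggregate are assumptions made for illustration.

```python
import numpy as np

def cv_aggregate(P, H, H_hist, D=2, rng=None):
    """Control-variate (CV) estimate of one propagation step P @ H.

    P      : (n, n) normalized propagation matrix (dense only for brevity)
    H      : (n, d) current activations
    H_hist : (n, d) stale historical activations stored for every node
    D      : neighbors sampled per node (the paper reports D(l) = 2 suffices)
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.zeros_like(H)
    for u in range(P.shape[0]):
        nbrs = np.flatnonzero(P[u])                   # neighborhood of u
        # Exact, cheap term: all neighbors, but with stale activations.
        hist_term = P[u, nbrs] @ H_hist[nbrs]
        # Monte-Carlo term: only D sampled neighbors on the residual H - H_hist,
        # rescaled so the estimate stays unbiased.
        sampled = rng.choice(nbrs, size=min(D, nbrs.size), replace=False)
        mc_term = (nbrs.size / sampled.size) * (P[u, sampled] @ (H[sampled] - H_hist[sampled]))
        out[u] = hist_term + mc_term
    return out
```

As the historical activations approach the exact ones during training, the residual H - H_hist shrinks and the variance of the sampled term vanishes, which is the intuition behind the variance-reduction claim quoted in the Research Type row.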
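
The Citeseer, Cora and PubMed rows refer to the standard public splits from Kipf & Welling (2017). As a reproducibility aid, the snippet below loads those splits via PyTorch Geometric's Planetoid dataset; this loader is an assumption for illustration and is not part of the paper's released code (Reddit and PPI are likewise available as torch_geometric.datasets.Reddit and PPI).

```python
# Assumes PyTorch Geometric is installed; not part of the paper's pipeline.
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='data/Cora', name='Cora')    # also 'CiteSeer', 'PubMed'
data = dataset[0]

# The public Kipf & Welling (2017) splits ship as boolean node masks.
print(data.train_mask.sum().item(),   # 140 training nodes for Cora
      data.val_mask.sum().item(),     # 500 validation nodes
      data.test_mask.sum().item())    # 1000 test nodes
```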