Graph Stochastic Neural Networks for Semi-supervised Learning
Authors: Haibo Wang, Chuan Zhou, Xin Chen, Jia Wu, Shirui Pan, Jilong Wang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three real-world datasets show that GSNN achieves substantial performance gain in different scenarios compared with state-of-the-art baselines. |
| Researcher Affiliation | Academia | Haibo Wang (1,2), Chuan Zhou (3,4), Xin Chen (2), Jia Wu (5), Shirui Pan (6), Jilong Wang (2). 1: Department of Computer Science and Technology, Tsinghua University, Beijing, China; 2: Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; 3: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; 4: School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; 5: Faculty of Science and Engineering, Macquarie University, Sydney, Australia; 6: Faculty of Information Technology, Monash University, Melbourne, Australia |
| Pseudocode | Yes | The pseudo-code of the algorithm is provided in the supplemental material. |
| Open Source Code | Yes | Our reproducible code is available at https://github.com/GSNN/GSNN. |
| Open Datasets | Yes | We conduct experiments on three commonly used benchmark datasets: Cora, Citeseer and Pubmed [25, 28] |
| Dataset Splits | Yes | Specifically, in each dataset, 20 nodes per class are used for training, 1000 nodes are used for evaluation and another 500 nodes are used for validation and early-stopping. (See the split-loading sketch after this table.) |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify versions for any software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | For our proposed two models (i.e., GSNN-M and GSNN-A), in qnet1 and pnet, we employ two information aggregation layers, and other settings related to hidden layers are consistent with GCN [3] and GAT [5] respectively. For example, the number of hidden units for GSNN-M is set to 16 and that for GSNN-A is set to 64. Besides, GSNN-A also employs the multi-head attention mechanism in the first hidden layer with 8 attention heads. For both GSNN-M and GSNN-A, the dimension of the hidden variable z is set to 16. In qnet2, we first employ a two-layer MLP to generate the representation r_v for each node v, whose dimension is 16. After that, we summarize all representations into a vector and use two fully-connected networks to convert it into the mean and covariance matrix of the multivariate Gaussian distribution. As mentioned in Section 3.3, the numbers of sampled instances of YU and z are both set to 1 for efficiency purposes. We use the Adam optimizer [30] during training, with a learning rate of 0.01 and weight decay of 5×10⁻⁴, and set the number of epochs to 200. During the inference phase, the sampling number L in Eq. (11) is set to 40. (A hedged configuration sketch follows this table.) |
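
The split quoted in the Dataset Splits row is the standard "Planetoid" split from Yang et al., which several graph-learning libraries ship out of the box. The sketch below loads it via torch_geometric; this library choice is an assumption on our part, since the paper does not name its data-loading stack.

```python
# A minimal sketch of loading the standard Planetoid splits quoted above:
# 20 nodes per class for training, 500 for validation, 1000 for testing.
# Using torch_geometric is an assumption; the paper's own loader is not specified.
from torch_geometric.datasets import Planetoid

for name in ["Cora", "CiteSeer", "PubMed"]:
    dataset = Planetoid(root="data", name=name, split="public")
    data = dataset[0]
    print(
        name,
        int(data.train_mask.sum()),  # 20 * num_classes training nodes
        int(data.val_mask.sum()),    # 500 validation nodes
        int(data.test_mask.sum()),   # 1000 test nodes
    )
```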
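
The Experiment Setup row likewise maps directly onto a small configuration. Below is a minimal PyTorch sketch of those hyperparameters; the placeholder model and the Cora feature/class dimensions are hypothetical stand-ins, not the authors' released implementation (which is linked in the Open Source Code row).

```python
# Hyperparameters quoted in the Experiment Setup row; the model below is a
# hypothetical placeholder, not the authors' GSNN architecture.
import torch
import torch.nn as nn

CONFIG = {
    "hidden_units_gsnn_m": 16,   # GCN-style backbone (GSNN-M)
    "hidden_units_gsnn_a": 64,   # GAT-style backbone (GSNN-A), 8 heads in layer 1
    "latent_dim_z": 16,          # dimension of the hidden variable z
    "lr": 0.01,
    "weight_decay": 5e-4,
    "epochs": 200,
    "inference_samples_L": 40,   # sampling number L in Eq. (11)
}

# Placeholder two-layer network standing in for the GSNN-M qnet1/pnet stack.
# 1433 / 7 are Cora's feature and class counts, used here only for illustration.
model = nn.Sequential(
    nn.Linear(1433, CONFIG["hidden_units_gsnn_m"]),
    nn.ReLU(),
    nn.Linear(CONFIG["hidden_units_gsnn_m"], 7),
)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=CONFIG["lr"],
    weight_decay=CONFIG["weight_decay"],
)
```

Note that the Adam learning rate, weight decay, and epoch count are the only training-loop details the paper states; anything beyond them (e.g., loss construction, sampling of YU and z) would have to come from the released code.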