Collaborative Graph Convolutional Networks: Unsupervised Learning Meets Semi-Supervised Learning

Authors: Binyuan Hui, Pengfei Zhu, Qinghua Hu

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark graph datasets validate the superiority of our proposed GMM-VGAE compared with the state-of-the-art attributed graph clustering networks. The performance of node classification is greatly improved by our proposed CGCN, which verifies that graph-based unsupervised learning can be well exploited to enhance the performance of semi-supervised learning.
Researcher Affiliation | Academia | Binyuan Hui, Pengfei Zhu, Qinghua Hu, College of Intelligence and Computing, Tianjin University, Tianjin, China; {huibinyuan, zhupengfei, huqinghua}@tju.edu.cn
Pseudocode | Yes | Algorithm 1: CGCN
Open Source Code | No | The paper does not provide concrete access to source code for the methodology.
Open Datasets | Yes | Cora, Citeseer and Pubmed (Sen et al. 2008) are citation networks where the number of nodes varies from 2708 to 19717 and the number of features varies from 500 to 3703.
Dataset Splits | No | For each run, we split the data into one small sample subset for training, and the test sample subset with 1000 samples. The paper explicitly describes a training and test split but does not specify a separate validation split; it only notes that baselines use a validation set.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not list the software dependencies (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | We train the GMM-VGAE module with the Adam learning algorithm (the learning rate is set as 0.01) for all datasets. We construct the encoder using a two-layer GCN with 32 and 16 filters respectively, and initialize the encoder weights as described in (Glorot and Bengio 2010). [...] We set the number of pretrain iterations Tp as 200, the number of retrain iterations Tr as 20, and the number of queried high-confidence nodes q as 20 for each pseudo-label assignment, with T = 5 times. Following (Kipf and Welling 2017), we set the learning rate, dropout rate, regularization weight, and the size of the second hidden layer as 0.01, 0.2, 0.5 × 10⁻⁴ and 16, respectively.
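
To make the quoted setup concrete, below is a minimal sketch of the unsupervised pretraining configuration it describes: a two-layer GCN encoder with 32 and 16 filters, Glorot (Xavier) initialization, the Adam optimizer with learning rate 0.01, and 200 pretraining iterations. This is not the authors' GMM-VGAE/CGCN code; the PyTorch Geometric Planetoid loader, the plain VGAE objective (the paper's Gaussian-mixture prior and the collaborative training with pseudo-labels are omitted), and all other unnamed details are assumptions for illustration only.

# Sketch only (not the authors' implementation): two-layer GCN encoder with
# 32 and 16 filters, Glorot init, Adam with lr = 0.01, 200 pretrain iterations,
# wrapped in a plain VGAE. The GMM prior and the CGCN pseudo-label loop are omitted.
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv, VGAE

# Assumed data source: PyTorch Geometric's Planetoid wrapper (Cora / Citeseer / Pubmed).
dataset = Planetoid(root="data/Cora", name="Cora")
data = dataset[0]

class Encoder(torch.nn.Module):
    """Two-layer GCN encoder: 32 filters, then 16-dimensional mu / logstd."""
    def __init__(self, in_dim, hidden=32, latent=16):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv_mu = GCNConv(hidden, latent)
        self.conv_logstd = GCNConv(hidden, latent)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        return self.conv_mu(h, edge_index), self.conv_logstd(h, edge_index)

model = VGAE(Encoder(dataset.num_features))

# Glorot (Xavier) initialization of the weight matrices, as in (Glorot and Bengio 2010).
for p in model.parameters():
    if p.dim() > 1:
        torch.nn.init.xavier_uniform_(p)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # learning rate from the quoted setup

# Pretraining loop: Tp = 200 iterations, as quoted in the Experiment Setup row.
model.train()
for epoch in range(200):
    optimizer.zero_grad()
    z = model.encode(data.x, data.edge_index)
    loss = model.recon_loss(z, data.edge_index) + model.kl_loss() / data.num_nodes
    loss.backward()
    optimizer.step()

In the paper, this kind of encoder feeds a Gaussian-mixture prior rather than the single Gaussian of plain VGAE, and the resulting high-confidence cluster assignments (q = 20 nodes per round, T = 5 rounds) supply pseudo-labels to the semi-supervised GCN branch; the sketch above only illustrates the quoted encoder and optimizer settings.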