Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning
Authors: Sheng Wan, Shirui Pan, Jian Yang, Chen Gong
AAAI 2021, pp. 10049–10057 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Intensive experimental results on a variety of real-world datasets firmly verify the effectiveness of our algorithm compared with other state-of-the-art methods. In experiments, we demonstrate the contributions of the supervision clues obtained from contrastive learning and the graph structure, and the superiority of our proposed CG3 over other state-of-the-art graph-based SSL methods is also verified. Experimental Results: To reveal the effectiveness of our proposed CG3 method, extensive experiments have been conducted on six benchmark datasets, including three widely-used citation networks (i.e., Cora, CiteSeer, and PubMed) (Sen et al. 2008; Bojchevski and Günnemann 2018), two Amazon product co-purchase networks (i.e., Amazon Computers and Amazon Photo) (Shchur et al. 2018), and one co-author network in computer science (i.e., Coauthor CS) (Shchur et al. 2018). Dataset statistics are summarized in Table 1. We report the mean accuracy of ten independent runs for every algorithm on each dataset to ensure a fair comparison. Node Classification Results: We evaluate the performance of our CG3 method on transductive semi-supervised node classification tasks by comparing it with a series of methods. Results under Scarce Labeled Training Data: To further investigate the ability of our proposed CG3 to deal with scarce supervision, we conduct experiments when the number of labeled examples is extremely small. Ablation Study: As mentioned in the introduction, our proposed CG3 employs the contrastive and graph generative losses to enrich the supervision signals from the data similarities and graph structure, respectively. To shed light on the contributions of these two components, we report the classification results of CG3 when each component is removed, on the three previously-used datasets: Cora, CiteSeer, and PubMed. |
| Researcher Affiliation | Academia | 1 PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology 2 Faculty of IT, Monash University, Australia 3 Department of Computing, Hong Kong Polytechnic University |
| Pseudocode | Yes | Algorithm 1: The Proposed CG3 Algorithm. Input: feature matrix X; adjacency matrix A; label matrix Y; maximum number of iterations T. 1: for t = 1 to T do; 2: // Multi-view representation learning; 3: perform localized graph convolution (i.e., Eq. (1)) and hierarchical graph convolution (Hu et al. 2019) to obtain $H_{\varphi_1}$ and $H_{\varphi_2}$, respectively; 4: // Calculate loss values; 5: calculate the semi-supervised contrastive loss $L_{ssc}$ based on Eqs. (3) and (6); 6: calculate the graph generative loss $L_{g2}$ by Eq. (11); 7: calculate the cross-entropy loss $L_{ce}$ with Eq. (13); 8: update the network parameters according to the overall loss function $L$ in Eq. (14); 9: end for; 10: conduct label prediction based on the trained network. Output: predicted label for each unlabeled graph node. (A hedged Python sketch of this training loop follows the table.) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository. |
| Open Datasets | Yes | Extensive experiments have been conducted on six benchmark datasets, including three widely-used citation networks (i.e., Cora, CiteSeer, and PubMed) (Sen et al. 2008; Bojchevski and Günnemann 2018), two Amazon product co-purchase networks (i.e., Amazon Computers and Amazon Photo) (Shchur et al. 2018), and one co-author network in computer science (i.e., Coauthor CS) (Shchur et al. 2018). Dataset statistics are summarized in Table 1. |
| Dataset Splits | Yes | For the Cora, CiteSeer, and PubMed datasets, we use the same train/validation/test splits as (Yang, Cohen, and Salakhudinov 2016). For the other three datasets (i.e., Amazon Computers, Amazon Photo, and Coauthor CS), we use 30 labeled nodes per class as the training set, 30 nodes per class as the validation set, and the rest as the test set. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, or other libraries/frameworks). |
| Experiment Setup | Yes | In this work, a two-layer GCN is employed with the input feature matrix X and adjacency matrix A, namely $H_{\varphi_1} = \hat{A}\,\sigma(\hat{A} X W^{(0)}) W^{(1)}$, where $W^{(0)}$ and $W^{(1)}$ denote the trainable weight matrices and $\sigma(\cdot)$ represents an activation function (e.g., the ReLU function (Nair and Hinton 2010)). Afterwards, we employ a simple yet effective hierarchical GCN model, i.e., HGCN (Hu et al. 2019), to generate the representations from the global view. The overall loss function of our CG3 can be presented as $L = L_{ce} + \lambda_{ssc} L_{ssc} + \lambda_{g2} L_{g2}$, where $\lambda_{ssc} > 0$ and $\lambda_{g2} > 0$ are tuning parameters to weight the importance of $L_{ssc}$ and $L_{g2}$, respectively. (A hedged sketch of this local-view encoder follows the table.) |
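
The "Experiment Setup" row fully specifies the local-view encoder, so a compact sketch is possible. Since no official code was released (see the "Open Source Code" row), the following PyTorch snippet is a minimal sketch under stated assumptions: `LocalGCN`, `normalize_adj`, and the bias-free linear layers are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    # A_hat = D^{-1/2} (A + I) D^{-1/2}: the standard GCN renormalization trick,
    # assumed here since the paper states the rule directly in terms of A_hat.
    A_tilde = A + torch.eye(A.size(0))
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_tilde * d_inv_sqrt.unsqueeze(0)

class LocalGCN(nn.Module):
    """Two-layer GCN for the local view: H_phi1 = A_hat σ(A_hat X W0) W1."""
    def __init__(self, in_dim: int, hid_dim: int, out_dim: int):
        super().__init__()
        self.W0 = nn.Linear(in_dim, hid_dim, bias=False)   # W^{(0)}
        self.W1 = nn.Linear(hid_dim, out_dim, bias=False)  # W^{(1)}

    def forward(self, X: torch.Tensor, A_hat: torch.Tensor) -> torch.Tensor:
        H = torch.relu(A_hat @ self.W0(X))  # sigma = ReLU, per the paper
        return A_hat @ self.W1(H)
```

The global-view encoder is HGCN (Hu et al. 2019), which is a separate model and is not reproduced here.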
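
Algorithm 1 (the "Pseudocode" row) can likewise be rendered as a training loop. The sketch below mirrors its control flow only: `contrastive_stand_in` is a generic InfoNCE-style cross-view loss, not the paper's semi-supervised contrastive loss of Eqs. (3) and (6); `generative_stand_in` is a simple link-reconstruction proxy for the generative loss of Eq. (11); the averaged-logits prediction head and all hyperparameter values are assumptions; `global_hgcn` stands for an HGCN encoder.

```python
import torch
import torch.nn.functional as F

def contrastive_stand_in(H1, H2, tau=0.5):
    # InfoNCE-style cross-view loss; positives are the same node in the other
    # view. The paper's L_ssc additionally exploits label information.
    Z1, Z2 = F.normalize(H1, dim=1), F.normalize(H2, dim=1)
    logits = Z1 @ Z2.t() / tau               # cross-view similarity matrix
    targets = torch.arange(Z1.size(0))       # diagonal entries are positives
    return F.cross_entropy(logits, targets)

def generative_stand_in(H, A):
    # Link-reconstruction proxy for the graph generative loss L_g2.
    A_rec = torch.sigmoid(H @ H.t())
    return F.binary_cross_entropy(A_rec, (A > 0).float())

def train_cg3(local_gcn, global_hgcn, X, A_hat, A, Y, train_mask,
              T=200, lambda_ssc=1.0, lambda_g2=1.0):
    params = list(local_gcn.parameters()) + list(global_hgcn.parameters())
    opt = torch.optim.Adam(params, lr=0.01, weight_decay=5e-4)
    for _ in range(T):
        # Multi-view representation learning (Algorithm 1, line 3).
        H1 = local_gcn(X, A_hat)             # local view, Eq. (1)
        H2 = global_hgcn(X, A_hat)           # global view (HGCN)
        # Loss terms (Algorithm 1, lines 5-7).
        L_ssc = contrastive_stand_in(H1, H2)
        L_g2 = generative_stand_in(H1, A)
        logits = (H1 + H2) / 2               # assumed fusion head
        L_ce = F.cross_entropy(logits[train_mask], Y[train_mask])
        # Overall objective L = L_ce + lambda_ssc*L_ssc + lambda_g2*L_g2 (line 8).
        loss = L_ce + lambda_ssc * L_ssc + lambda_g2 * L_g2
        opt.zero_grad(); loss.backward(); opt.step()
    # Label prediction on the trained network (Algorithm 1, line 10).
    with torch.no_grad():
        return ((local_gcn(X, A_hat) + global_hgcn(X, A_hat)) / 2).argmax(dim=1)
```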