Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering
Authors: Yaming Yang, Ziyu Guan, Zhe Wang, Wei Zhao, Cai Xu, Weigang Lu, Jianbin Huang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four real-world datasets demonstrate the superior effectiveness of SHGP against state-of-the-art unsupervised baselines and even semi-supervised baselines. In this section, we verify the generalization ability of the proposed SHGP by transferring the pre-trained object embeddings to various downstream tasks including object classification, object clustering, and embedding visualization. |
| Researcher Affiliation | Academia | Yaming Yang, Ziyu Guan, Zhe Wang, Wei Zhao, Cai Xu, Weigang Lu, Jianbin Huang School of Computer Science and Technology, Xidian University {yym@, zyguan@, zwang@stu., ywzhao@mail., cxu@, wglu@stu., jbhuang@}xidian.edu.cn |
| Pseudocode | Yes | Algorithm 1 The overall procedure of SHGP |
| Open Source Code | Yes | We release our source code at: https://github.com/kepsail/SHGP. |
| Open Datasets | Yes | In the experiments, we use four publicly available HIN benchmark datasets, which are widely used in previous related works [38, 32, 18, 23, 33]. Their statistics are summarized in Table 1. Please see Appendix A.1 for more details of these datasets. |
| Dataset Splits | Yes | On each dataset, for the objects that have ground-truth labels, we randomly select {4%, 6%, 8%} objects as the training set. The others are divided equally as the validation set and the test set. (A minimal split sketch appears below the table.) |
| Hardware Specification | Yes | All the experiments are conducted on an NVIDIA GTX 1080Ti GPU. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'Xavier uniform distribution' but does not provide specific version numbers for any libraries, frameworks, or programming languages used (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | For the proposed SHGP, in all the experiments, we use two HGCN layers as the Att-HGNN encoder, and search the dimensionalities of the hidden layers in the set {64, 128, 256, 512}. All the model parameters are initialized by the Xavier uniform distribution [6], and they are optimized through the Adam optimizer. The learning rate and weight decay are searched from 1e-4 to 1e-2. For the number of warm-up epochs, we search its best value in the set {5, 10, 20, 30, 40, 50}. (A hedged configuration sketch appears below the table.) |
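The split protocol quoted in the Dataset Splits row is simple to reproduce. Below is a minimal sketch of it in Python; the helper name, the seed, the rounding choice, and the object count in the usage line are illustrative assumptions, not details from the paper.

```python
import numpy as np

def split_labeled_objects(num_labeled, train_ratio, seed=0):
    """Randomly pick `train_ratio` of the labeled objects for training,
    then divide the remainder equally into validation and test sets,
    as described in the Dataset Splits row above."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_labeled)
    n_train = int(round(train_ratio * num_labeled))
    train = idx[:n_train]
    # the paper splits the remaining objects equally
    val, test = np.array_split(idx[n_train:], 2)
    return train, val, test

# e.g. the 4% training setting on a dataset with 4000 labeled objects
# (the count 4000 is a placeholder, not a dataset statistic)
train_idx, val_idx, test_idx = split_labeled_objects(4000, 0.04)
```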
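Likewise, the Experiment Setup row fully specifies a search space. The PyTorch sketch below is one hedged reading of that setup: Xavier-uniform initialization, Adam optimization, and a grid over the quoted hyper-parameter values. `build_shgp` is a hypothetical stand-in for the authors' Att-HGNN encoder (see their released code at https://github.com/kepsail/SHGP), and the intermediate learning-rate and weight-decay points are assumptions, since the paper only gives the 1e-4 to 1e-2 range.

```python
import itertools
import torch

def build_shgp(in_dim, hidden_dim, num_layers=2):
    """Hypothetical stand-in for the two-layer Att-HGNN encoder;
    a plain MLP here, only to make the sketch self-contained."""
    layers, dim = [], in_dim
    for _ in range(num_layers):
        layers += [torch.nn.Linear(dim, hidden_dim), torch.nn.ELU()]
        dim = hidden_dim
    return torch.nn.Sequential(*layers)

def init_xavier(module):
    # Xavier uniform initialization, as quoted in the setup row
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        torch.nn.init.zeros_(module.bias)

# Grid from the table; intermediate lr / weight-decay points are assumed
grid = itertools.product(
    [64, 128, 256, 512],      # hidden-layer dimensionality
    [1e-4, 1e-3, 1e-2],       # learning rate, searched in [1e-4, 1e-2]
    [1e-4, 1e-3, 1e-2],       # weight decay, searched in [1e-4, 1e-2]
    [5, 10, 20, 30, 40, 50],  # warm-up epochs
)

for hidden_dim, lr, wd, warmup in grid:
    model = build_shgp(in_dim=334, hidden_dim=hidden_dim)  # 334: placeholder feature dim
    model.apply(init_xavier)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)
    # ... pre-train SHGP here, using `warmup` warm-up epochs ...
```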