Self-Supervised Heterogeneous Graph Learning: a Homophily and Heterogeneity View

Authors: Yujie Mo, Feiping Nie, Ping Hu, Heng Tao Shen, Zheng Zhang, Xinchao Wang, Xiaofeng Zhu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 3 EXPERIMENTS: In this section, we conduct experiments on both heterogeneous and homogeneous graph datasets to evaluate the proposed HERO in terms of different downstream tasks (i.e., node classification and similarity search), compared to heterogeneous and homogeneous graph methods. Detailed settings are shown in Appendix D. Additional experimental results are shown in Appendix E.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, University of Electronic Science and Technology of China; 2 National University of Singapore; 3 School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University; 4 School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
Pseudocode | Yes | Algorithm 1: The pseudo-code of the proposed method.
Open Source Code | Yes | The code is released at https://github.com/YujieMo/HERO.
Open Datasets | Yes | The used datasets include five heterogeneous graph datasets and four homogeneous graph datasets. Heterogeneous graph datasets include three academic datasets (i.e., ACM (Wang et al., 2019), DBLP (Wang et al., 2019), and Aminer (Hu et al., 2019)), one business dataset (i.e., Yelp (Lu et al., 2019)), and one large knowledge graph dataset (i.e., Freebase (Lv et al., 2021)). Homogeneous graph datasets include two sales datasets (i.e., Amazon-Photo and Amazon-Computers (Shchur et al., 2018)) and two co-authorship datasets (i.e., Coauthor-CS and Coauthor-Physics (Sinha et al., 2015)).
Dataset Splits | No | Table 3 provides '#Training' and '#Test' node counts for the datasets but does not mention a validation split, and the text does not describe how the training, validation, and test sets are constructed. (A hedged loading-and-split sketch is given after this table.)
Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions general components such as the Adam optimizer and the ReLU activation, but it does not specify version numbers for the programming languages, libraries, or frameworks used (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup | Yes | Detailed settings are shown in Appendix D. ... In the proposed method, all parameters were optimized by the Adam optimizer (Kingma & Ba, 2015) with an initial learning rate. Moreover, we use early stopping with a patience of 30 to train the proposed SHGL model. ... In the proposed method, we employ the non-negative parameters (i.e., γ, η, and λ) to achieve a trade-off between the terms of the consistency loss, the specificity loss, and the final objective function. To investigate the impact of γ, η, and λ with different settings, we conduct node classification on the ACM dataset by varying the value of each parameter in the range of [10^{-3}, 10^{3}] and reporting the results in Figure 7. (A hedged training-setup sketch is given after this table.)
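
Since the report above notes that the paper does not specify a validation split, the following is a minimal sketch, assuming PyTorch Geometric, of loading one of the reported homogeneous datasets (Amazon-Photo) and building a random node split. The 10%/10%/80% ratio and the seed are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (not from the paper): load a homogeneous dataset with
# PyTorch Geometric and build a random train/val/test node split.
# The 10%/10%/80% ratio and the seed below are assumptions; the paper
# only reports "#Training" and "#Test" counts and no validation split.
import torch
from torch_geometric.datasets import Amazon, Coauthor

dataset = Amazon(root="data/amazon-photo", name="Photo")  # or Coauthor(root=..., name="CS")
data = dataset[0]

perm = torch.randperm(data.num_nodes, generator=torch.Generator().manual_seed(0))
n_train = int(0.1 * data.num_nodes)   # assumed ratio
n_val = int(0.1 * data.num_nodes)     # assumed ratio

train_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
val_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
test_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
train_mask[perm[:n_train]] = True
val_mask[perm[n_train:n_train + n_val]] = True
test_mask[perm[n_train + n_val:]] = True
```

The heterogeneous datasets (ACM, DBLP, Aminer, Yelp, Freebase) are not covered by this sketch; loading them would presumably follow the scripts in the released repository.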
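The experiment-setup row reports Adam optimization, early stopping with a patience of 30, and trade-off weights γ, η, and λ varied over [10^{-3}, 10^{3}]. Below is a minimal sketch, in plain PyTorch, of such a training and sweep loop; the encoder, features, loss terms, learning rate, and grid points are stand-ins for illustration, not HERO's actual implementation.

```python
# Minimal sketch (not the authors' implementation): Adam optimization with
# early stopping (patience = 30, as reported) and a grid over the trade-off
# weights gamma, eta, lambda in [1e-3, 1e3]. The encoder, features, and loss
# below are stand-ins, not HERO's actual model or objectives.
import itertools
import torch

def train(model, features, loss_fn, lr=1e-3, max_epochs=1000, patience=30):
    """Optimize with Adam; stop once the loss has not improved for `patience` epochs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best, wait = float("inf"), 0
    for _ in range(max_epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features))
        loss.backward()
        optimizer.step()
        if loss.item() < best - 1e-6:
            best, wait = loss.item(), 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best

features = torch.randn(100, 16)                    # stand-in node features
grid = [1e-3, 1e-2, 1e-1, 1.0, 1e1, 1e2, 1e3]      # assumed grid points in [1e-3, 1e3]

for gamma, eta, lam in itertools.product(grid, repeat=3):
    encoder = torch.nn.Linear(16, 8)               # stand-in for HERO's encoder
    # Stand-in objective: weighted terms standing in for the consistency and
    # specificity losses that gamma, eta, and lambda trade off in the paper.
    loss_fn = lambda z, g=gamma, e=eta, l=lam: (
        g * z.pow(2).mean() + e * z.abs().mean() + l * z.mean().abs()
    )
    train(encoder, features, loss_fn)
```

The learning rate is left at a generic default here because the quoted sentence ends at "with an initial learning rate" without stating the value.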