SlotGAT: Slot-based Message Passing for Heterogeneous Graphs

Authors: Ziang Zhou, Jieming Shi, Renchi Yang, Yuanhang Zou, Qing Li

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The superiority of SlotGAT is evaluated against 13 baselines on 6 datasets for node classification and link prediction.
Researcher Affiliation | Collaboration | Department of Computing, The Hong Kong Polytechnic University; Department of Computer Science, Hong Kong Baptist University; Tencent.
Pseudocode | Yes | Algorithm 1 shows the pseudocode of SlotGAT.
Open Source Code | Yes | Our code is at https://github.com/scottjiao/SlotGAT_ICML23/.
Open Datasets | Yes | Table 1 reports the statistics of benchmark datasets widely used in (Lv et al., 2021; Wang et al., 2019d; Zhao et al., 2022; Zhang et al., 2019; Yun et al., 2019; Yang et al., 2022). The descriptions of all datasets are in Appendix A.1. All benchmark datasets are accessible on the online HGB platform (https://www.biendata.xyz/hgb/).
Dataset Splits | Yes | For node classification, following (Lv et al., 2021), the labeled training set is split into training and validation with a ratio of 80%:20%, while the testing data are fixed, with detailed numbers in Appendix A.2, Table 12. For link prediction, a ratio of 81%:9%:10% is adopted to divide the edges into training, validation, and testing (see the split sketch after the table).
Hardware Specification | Yes | All experiments are conducted on a machine with an Intel(R) Xeon(R) E5-2603 v4 @ 1.70GHz CPU, 131GB RAM, and an NVIDIA GeForce 3090 card, with CUDA version 11.3.
Software Dependencies | Yes | The experiments use CUDA version 11.3 (stated alongside the hardware description above).
Experiment Setup | Yes | Hyper-parameter search space: we search the learning rate within {1, 5} × {1e-5, 1e-4, 1e-3, 1e-2}, the weight decay rate within {1, 5} × {1e-5, 1e-4, 1e-3}, the dropout rate for features within {0.2, 0.5, 0.8, 0.9}, the dropout rate for connections within {0, 0.2, 0.5, 0.8, 0.9}, and the number of hidden layers L within {2, 3, 4, 5, 6}. The dimension of hidden embeddings d_l is shared across all layers and searched within {32, 64, 128}. The number of epochs is searched within {40, 300, 1000} with early-stopping patience 40, and the dimension d_s of the slot attention vector within {3, 8, 32, 64}. Following (Lv et al., 2021), for the input feature type, feat = 0 denotes using all given features, feat = 1 denotes using only target-node features (zero vectors for others), and feat = 2 denotes one-hot features for all nodes. For node classification, we use feat = 1 and set the number of attention heads K to 8. For link prediction, we use feat = 2 and set K to 2 (see the search-space sketch after the table).
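As a minimal sketch of the split ratios reported above (not the authors' split code; the seed, array sizes, and index variables below are hypothetical, and the fixed HGB test nodes are not re-sampled), the splits could be generated along these lines:

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed; the paper does not state one

# Node classification: 80% / 20% train / validation over the labeled set;
# the test nodes are fixed by the HGB benchmark and are not re-sampled here.
labeled_idx = rng.permutation(1000)              # hypothetical labeled node ids
cut = int(0.8 * len(labeled_idx))
train_idx, val_idx = labeled_idx[:cut], labeled_idx[cut:]

# Link prediction: 81% / 9% / 10% train / validation / test over the edges.
edge_idx = rng.permutation(5000)                 # hypothetical edge ids
c1, c2 = int(0.81 * len(edge_idx)), int(0.90 * len(edge_idx))
train_e, val_e, test_e = edge_idx[:c1], edge_idx[c1:c2], edge_idx[c2:]
```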
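For concreteness, the reported hyper-parameter search space can be written out as a grid. This is a sketch of the stated ranges only; the dictionary keys and the random-sampling driver are assumptions, since the paper's quoted setup does not describe the tuning procedure itself:

```python
import random

# Search space as reported; {1, 5} x {1e-5, ...} expands to 1e-5, 5e-5, 1e-4, ...
search_space = {
    "lr":            [a * b for a in (1, 5) for b in (1e-5, 1e-4, 1e-3, 1e-2)],
    "weight_decay":  [a * b for a in (1, 5) for b in (1e-5, 1e-4, 1e-3)],
    "feat_dropout":  [0.2, 0.5, 0.8, 0.9],
    "conn_dropout":  [0.0, 0.2, 0.5, 0.8, 0.9],
    "num_layers_L":  [2, 3, 4, 5, 6],
    "hidden_dim_dl": [32, 64, 128],     # shared across all hidden layers
    "epochs":        [40, 300, 1000],   # with early-stopping patience 40
    "slot_dim_ds":   [3, 8, 32, 64],    # dimension of the slot attention vector
}

# Sample one candidate configuration (random search is an assumption here;
# grid search over the same space would also match the reported ranges).
config = {name: random.choice(values) for name, values in search_space.items()}
print(config)
```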