Learning to Pre-train Graph Neural Networks
Authors: Yuanfu Lu, Xunqiang Jiang, Yuan Fang, Chuan Shi
AAAI 2021, pp. 4276-4284
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct a systematic empirical study on the pre-training of various GNN models, using both a public collection of protein graphs and a new compilation of bibliographic graphs for pre-training. Experimental results show that L2P-GNN is capable of learning effective and transferable prior knowledge that yields powerful representations for downstream tasks. |
| Researcher Affiliation | Collaboration | Yuanfu Lu (1,2), Xunqiang Jiang (1), Yuan Fang (3), Chuan Shi (1,4); 1: Beijing University of Posts and Telecommunications; 2: WeChat Search Application Department, Tencent Inc., China; 3: Singapore Management University; 4: Peng Cheng Laboratory, Shenzhen, China |
| Pseudocode | Yes | Detailed pseudocode of the algorithm is provided in the supplemental material (Appendix). |
| Open Source Code | Yes | (Code and datasets are available at https://github.com/rootlu/L2P-GNN.) |
| Open Datasets | Yes | The biology graphs come from a public repository [1], covering 394,925 protein subgraphs (Marinka et al. 2019). We further present a new collection of bibliographic graphs called PreDBLP, purposely compiled for pre-training GNNs based on DBLP [2], which contains 1,054,309 paper subgraphs in 31 fields. [1] http://snap.stanford.edu/gnn-pretrain [2] https://dblp.uni-trier.de The new bibliographic dataset is publicly released. |
| Dataset Splits | Yes | For both domains, we split downstream data with 8:1:1 ratio for train/validation/test sets. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper states "The hyper-parameter settings and experimental environment are discussed in Appendix." but does not provide specific hardware details in the main text. |
| Software Dependencies | No | The paper mentions that "Implementation details are presented in Appendix." but does not provide specific software names with version numbers in the main text. |
| Experiment Setup | Yes | Lastly, we investigate the effect of the number of node- and graph-level adaptation steps (s, t), as well as the dimension of node representations. We plot the performance of L2P-GNN under combinations of 0 ≤ s ≤ 3 and 0 ≤ t ≤ 3 in Fig. 3(b). We find that L2P-GNN is robust to different values of s and t, except when one or both of them are zero (i.e., no adaptation at all). In particular, L2P-GNN can adapt quickly with only one gradient update in both adaptations (i.e., s = t = 1). Finally, we summarize the impact of the dimension in Fig. 3(c). We observe that L2P-GNN achieves the optimal performance when the dimension is 300 and is generally stable around the optimal setting, indicating that L2P-GNN is robust w.r.t. the representation dimension. (A schematic sketch of the two-level adaptation appears below the table.) |
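
As a concrete reading of the Dataset Splits row, here is a minimal sketch of an 8:1:1 random split. The function name `split_8_1_1`, the `subgraphs` argument, and the fixed seed are illustrative assumptions, not taken from the authors' released code (https://github.com/rootlu/L2P-GNN).

```python
import random

def split_8_1_1(subgraphs, seed=0):
    """Shuffle a list of subgraphs and split it 80/10/10 into
    train/validation/test, matching the 8:1:1 ratio quoted above.
    (Hypothetical helper; not the authors' split utility.)"""
    rng = random.Random(seed)
    items = list(subgraphs)
    rng.shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```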
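
The Experiment Setup row varies the node- and graph-level adaptation step counts (s, t). The sketch below assumes a MAML-style inner loop with plain SGD; `node_task_loss`, `graph_task_loss`, and the way the model consumes a batch are placeholders standing in for the paper's self-supervised tasks, not the authors' implementation.

```python
import copy
import torch

def node_task_loss(model, batch):
    # Placeholder node-level self-supervised loss (hypothetical).
    return model(batch).pow(2).mean()

def graph_task_loss(model, batch):
    # Placeholder graph-level self-supervised loss (hypothetical).
    return model(batch).mean().abs()

def adapt(gnn, support_batch, s=1, t=1, inner_lr=0.01):
    """Return a copy of `gnn` after s node-level and t graph-level gradient
    steps on a support batch. With s = t = 0 no adaptation happens, matching
    the degenerate case noted in the quoted ablation."""
    adapted = copy.deepcopy(gnn)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(s):  # node-level adaptation steps
        loss = node_task_loss(adapted, support_batch)
        opt.zero_grad()
        loss.backward()
        opt.step()
    for _ in range(t):  # graph-level adaptation steps
        loss = graph_task_loss(adapted, support_batch)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapted
```

The quoted finding that s = t = 1 already adapts well is why the defaults above take a single gradient update at each level.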