Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks
Authors: Jiarong Xu, Renhong Huang, Xin Jiang, Yuxuan Cao, Carl Yang, Chunping Wang, Yang Yang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that the proposed APT is able to obtain an efficient pre-training model with fewer training data and better downstream performance. (Section 4: Experiments) |
| Researcher Affiliation | Collaboration | Jiarong Xu¹, Renhong Huang², Xin Jiang³, Yuxuan Cao², Carl Yang⁴, Chunping Wang⁵, Yang Yang²; ¹Fudan University, ²Zhejiang University, ³Lehigh University, ⁴Emory University, ⁵Finvolution Group |
| Pseudocode | Yes | The overall algorithm for APT is given in Algorithm 1. |
| Open Source Code | Yes | We provide an open-source implementation of our model APT at https://github.com/galina0217/APT. |
| Open Datasets | Yes | The datasets for pre-training and testing, along with their statistics, are listed in Appendix D. Pre-training datasets are collected from different domains, including social, citation, and movie networks. The graph datasets for pre-training and testing in this paper are collected from a wide spectrum of domains (see Table 3 for an overview). ... from Open Graph Benchmark [10]. |
| Dataset Splits | No | The paper states 'For each dataset, we consistently use 90% of the data as the training set, and 10% as the testing set.' but does not explicitly mention a separate validation split or cross-validation setup. (A minimal split-and-evaluate sketch appears below the table.) |
| Hardware Specification | Yes | We conduct all experiments on a single Linux machine with an Intel Xeon Gold 5118 (128 GB memory) and a GeForce GTX Tesla P4 GPU (8 GB memory). |
| Software Dependencies | Yes | Our model is implemented under the following software settings: Pytorch version 1.4.0+cu100, CUDA version 10.0, networkx version 2.3, DGL version 0.4.3post2, sklearn version 0.20.3, numpy version 1.19.4, Python version 3.7.1. |
| Experiment Setup | Yes | In the training phase, we aim to utilize data from different domains to pre-train one graph model. We iteratively select graphs for pre-training until the predictive uncertainty of any candidate graph is below 3.5. For each selected graph, we choose samples with predictive uncertainty higher than 3. We set the number of subgraph instances queried in the graph for uncertainty estimation, M, to 500. The time-adaptive parameter γt in Eq. (4) follows γt ∼ Beta(1, βt), where βt = 3 · 0.995^t. We set the trade-off parameter λ = 10 for APT-L2, and λ = 500 for APT. The total iteration number is 100. We adopt GCC as the backbone pre-training model with its default hyper-parameters, including their subgraph instance definition. In the fine-tuning phase, we select logistic regression or SVM as the downstream classifier and adopt the same setting as GCC. (A minimal sketch of this selection loop appears below the table.) |
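
The selection procedure quoted in the Experiment Setup row can be pictured with the minimal sketch below. This is an illustration under assumptions, not the authors' implementation (see their repository linked above): the helpers `estimate_uncertainty` and `pretrain_step` are hypothetical, the stopping rule is interpreted as "keep iterating while some candidate graph still has predictive uncertainty above 3.5", and the exact way γt enters the graph-selection criterion of Eq. (4) is not reproduced. Only the quoted thresholds (3.5 per graph, 3 per sample, M = 500, 100 iterations) and the Beta-distributed time-adaptive parameter come from the paper.

```python
import numpy as np

# Illustrative sketch only: the helper names (estimate_uncertainty, pretrain_step)
# and the way selected data are consumed are assumptions, not the authors' code.
GRAPH_THRESHOLD = 3.5    # skip a graph once its uncertainty falls below this
SAMPLE_THRESHOLD = 3.0   # keep only subgraph instances whose uncertainty exceeds this
M = 500                  # subgraph instances queried per graph for uncertainty estimation
TOTAL_ITERS = 100        # total number of pre-training iterations

def time_adaptive_gamma(t: int, rng: np.random.Generator) -> float:
    """Sample gamma_t ~ Beta(1, beta_t) with beta_t = 3 * 0.995**t."""
    beta_t = 3.0 * (0.995 ** t)
    return rng.beta(1.0, beta_t)

def pretrain_apt(candidate_graphs, model, estimate_uncertainty, pretrain_step, rng=None):
    """estimate_uncertainty(model, graph, M) -> np.ndarray of per-sample uncertainties;
    pretrain_step(model, selected, gamma_t) -> updated model. Both are assumed helpers."""
    rng = rng or np.random.default_rng(0)
    for t in range(TOTAL_ITERS):
        gamma_t = time_adaptive_gamma(t, rng)       # time-adaptive trade-off weight
        selected = []
        for g in candidate_graphs:
            scores = estimate_uncertainty(model, g, M)
            if scores.mean() < GRAPH_THRESHOLD:     # graph is already "easy" for the model
                continue
            mask = scores > SAMPLE_THRESHOLD        # retain only the uncertain instances
            selected.append((g, mask))
        if not selected:                            # every candidate fell below 3.5: stop
            break
        model = pretrain_step(model, selected, gamma_t)
    return model
```

Since the mean of Beta(1, βt) is 1/(1 + βt), the decaying βt = 3 · 0.995^t makes γt larger on average as pre-training proceeds.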
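
For the Dataset Splits and fine-tuning rows, the following is a minimal sketch of the reported 90%/10% protocol with a frozen-embedding downstream classifier (logistic regression or SVM). The function name and inputs are illustrative; the authors state that they adopt the same fine-tuning setting as GCC.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# `embeddings` and `labels` stand in for representations produced by the
# pre-trained encoder and their downstream task labels (hypothetical inputs).
def evaluate_downstream(embeddings, labels, use_svm=False, seed=0):
    X_train, X_test, y_train, y_test = train_test_split(
        embeddings, labels, test_size=0.1, random_state=seed)   # 90% train / 10% test
    clf = SVC() if use_svm else LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)                             # accuracy on held-out 10%
```

Note that, consistent with the Dataset Splits row, no validation split is carved out here.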