Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks

Authors: Jun Yin, Chaozhuo Li, Hao Yan, Jianxun Lian, Senzhang Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that π-GNN significantly surpasses the leading interpretable GNN baselines, with up to 9.98% interpretation improvement and 16.06% classification accuracy improvement. Meanwhile, π-GNN pre-trained on the graph classification task also achieves top-tier interpretation performance on the node classification task, which further verifies its promising generalization across different downstream tasks. In this section, we conduct extensive experiments to evaluate the performance of π-GNN by answering the following two questions.
Researcher Affiliation | Collaboration | Jun Yin, Central South University, yinjun2000@csu.edu.cn; Chaozhuo Li, Microsoft Research Asia, cli@microsoft.com; Hao Yan, Central South University, CSUyh1999@csu.edu.cn; Jianxun Lian, Microsoft Research Asia, jianxun.lian@microsoft.com; Senzhang Wang, Central South University, szwang@csu.edu.cn
Pseudocode | Yes | Algorithm 1: Graph-Hypergraph Transformation (an illustrative transformation sketch follows the table).
Open Source Code | Yes | Our code and datasets are available at https://anonymous.4open.science/r/PI-GNN-F86C
Open Datasets | Yes | Synthetic datasets: BA-2Motifs [10] and Spurious-Motif [13] are two widely used synthetic datasets for evaluating the interpretation performance of GNN explanation methods. Real-world datasets: we use four real-world datasets: the superpixel graph dataset MNIST-75sp [40], the sentiment analysis dataset Graph-SST2 [16], and two chemical molecule datasets, Mutag [18] and Ogbg-Molhiv [47]. (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | The PT-Motifs dataset is split into a training set of 50,000 graphs, a validation set of 10,000 graphs, and a testing set of 20,000 graphs. (A split sketch follows the table.)
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA GeForce RTX 3090 GPU (24 GB).
Software Dependencies | No | The paper does not provide specific version numbers for the software dependencies (e.g., Python, PyTorch, or other libraries/frameworks) used to run the experiments; it only mentions methods and models in general terms.
Experiment Setup | Yes | During the pre-training phase, the batch size is chosen from {32, 64, 128, 256} and the learning rate from {10^-3, 5×10^-3, 10^-4, 10^-5, 10^-6}. The number of pre-training epochs is chosen from {20, 40, 60, 80}. As shown in Tables 8 and 9, we present the downstream predictor architecture and the fine-tuning details for the graph classification datasets and the node classification datasets, respectively. (A grid-enumeration sketch follows the table.)
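
The report names the paper's Algorithm 1 (Graph-Hypergraph Transformation) but does not reproduce it. As a rough illustration only, here is a minimal Python sketch of one common graph-to-hypergraph construction, assuming each node's closed neighborhood is treated as a hyperedge; this construction is an assumption, not the paper's exact procedure:

    import networkx as nx

    def graph_to_hypergraph(G: nx.Graph) -> dict:
        """Map each node's closed neighborhood to one hyperedge.

        Illustrative only: NOT the paper's Algorithm 1, whose
        details are not included in this report.
        """
        # Hyperedge e_v = {v} ∪ N(v) for every node v.
        return {v: frozenset({v}) | set(G.neighbors(v)) for v in G.nodes}

    # Example: a 4-cycle yields four 3-node hyperedges.
    print(graph_to_hypergraph(nx.cycle_graph(4)))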
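
Two of the real-world datasets quoted in the Open Datasets row are available through standard libraries. A minimal loading sketch, assuming PyTorch Geometric and the ogb package are installed (the versions used by the authors are not reported):

    from torch_geometric.datasets import TUDataset
    from ogb.graphproppred import PygGraphPropPredDataset

    # Mutag: small molecular graphs with binary mutagenicity labels.
    mutag = TUDataset(root="data/TUDataset", name="MUTAG")

    # Ogbg-Molhiv: molecular graphs with a standard OGB scaffold split.
    molhiv = PygGraphPropPredDataset(root="data/ogb", name="ogbg-molhiv")

    print(len(mutag), len(molhiv))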
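
The Dataset Splits row reports a 50,000/10,000/20,000 split of PT-Motifs. A minimal sketch of producing a split of these sizes, assuming the 80,000 graphs sit in a list and the split is a seeded random shuffle (the paper's actual splitting procedure and seed are not stated here):

    import random

    def split_pt_motifs(graphs, n_train=50_000, n_val=10_000, n_test=20_000, seed=0):
        """Shuffle and slice into train/val/test of the reported sizes."""
        assert len(graphs) == n_train + n_val + n_test  # 80,000 graphs in total
        idx = list(range(len(graphs)))
        random.Random(seed).shuffle(idx)  # seeded for repeatability (assumed)
        train = [graphs[i] for i in idx[:n_train]]
        val = [graphs[i] for i in idx[n_train:n_train + n_val]]
        test = [graphs[i] for i in idx[n_train + n_val:]]
        return train, val, test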
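
The Experiment Setup row lists hyperparameter grids for the pre-training phase. A minimal sketch that enumerates the 4 × 5 × 4 = 80 reported configurations; `pretrain` is a hypothetical stand-in for the authors' training routine, which the report does not include:

    from itertools import product

    # Grids quoted in the Experiment Setup row.
    batch_sizes = [32, 64, 128, 256]
    learning_rates = [1e-3, 5e-3, 1e-4, 1e-5, 1e-6]
    epoch_grid = [20, 40, 60, 80]

    for bs, lr, n_epochs in product(batch_sizes, learning_rates, epoch_grid):
        config = {"batch_size": bs, "lr": lr, "epochs": n_epochs}
        # score = pretrain(config)  # hypothetical: not provided in the report
        print(config)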