Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks

Authors: Jun Yin, Chaozhuo Li, Hao Yan, Jianxun Lian, Senzhang Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that π-GNN significantly surpasses the leading interpretable GNN baselines, with up to 9.98% interpretation improvement and 16.06% classification accuracy improvement. Meanwhile, π-GNN pre-trained on the graph classification task also achieves top-tier interpretation performance on the node classification task, which further verifies its promising generalization across different downstream tasks. In this section, we conduct extensive experiments to evaluate the performance of π-GNN by answering the following two questions.
Researcher Affiliation | Collaboration | Jun Yin, Central South University, yinjun2000@csu.edu.cn; Chaozhuo Li, Microsoft Research Asia, cli@microsoft.com; Hao Yan, Central South University, CSUyh1999@csu.edu.cn; Jianxun Lian, Microsoft Research Asia, jianxun.lian@microsoft.com; Senzhang Wang, Central South University, szwang@csu.edu.cn
Pseudocode | Yes | Algorithm 1: Graph-Hypergraph Transformation (an illustrative transformation sketch follows the table).
Open Source Code | Yes | Our code and datasets are available at https://anonymous.4open.science/r/PI-GNN-F86C
Open Datasets | Yes | Synthetic datasets: BA-2Motifs [10] and Spurious-Motif [13] are two widely used synthetic datasets for evaluating the interpretation performance of GNN explanation methods. Real-world datasets: we use four real-world datasets: the superpixel graph dataset MNIST-75sp [40], the sentiment analysis dataset Graph-SST2 [16], and two chemical molecule datasets, Mutag [18] and Ogbg-Molhiv [47]. (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | The PT-Motifs dataset is split into a training set of 50,000 graphs, a validation set of 10,000 graphs, and a testing set of 20,000 graphs. (A split sketch follows the table.)
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA GeForce RTX 3090 GPU (24 GB).
Software Dependencies | No | The paper does not provide specific version numbers for the software dependencies (e.g., Python, PyTorch, or other libraries/frameworks) used to run the experiments; it only mentions methods and models in general terms.
Experiment Setup | Yes | During the pre-training phase, the batch size is chosen from {32, 64, 128, 256} and the learning rate from {10^-3, 5×10^-3, 10^-4, 10^-5, 10^-6}. The number of pre-training epochs is chosen from {20, 40, 60, 80}. As shown in Tables 8 and 9, we present the downstream predictor architecture and the fine-tuning details for the graph classification datasets and the node classification datasets, respectively. (A grid-enumeration sketch follows the table.)
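
The report names the paper's Algorithm 1 (Graph-Hypergraph Transformation) but does not reproduce it. As a rough illustration only, here is a minimal Python sketch of one common graph-to-hypergraph construction, assuming each node's closed neighborhood is treated as a hyperedge; this construction is an assumption, not the paper's exact procedure:

    import networkx as nx

    def graph_to_hypergraph(G: nx.Graph) -> dict:
        """Map each node's closed neighborhood to one hyperedge.

        Illustrative only: NOT the paper's Algorithm 1, whose
        details are not included in this report.
        """
        # Hyperedge e_v = {v} ∪ N(v) for every node v.
        return {v: frozenset({v}) | set(G.neighbors(v)) for v in G.nodes}

    # Example: a 4-cycle yields four 3-node hyperedges.
    print(graph_to_hypergraph(nx.cycle_graph(4)))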
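
Two of the real-world datasets quoted in the Open Datasets row are available through standard libraries. A minimal loading sketch, assuming PyTorch Geometric and the ogb package are installed (the versions used by the authors are not reported):

    from torch_geometric.datasets import TUDataset
    from ogb.graphproppred import PygGraphPropPredDataset

    # Mutag: small molecular graphs with binary mutagenicity labels.
    mutag = TUDataset(root="data/TUDataset", name="MUTAG")

    # Ogbg-Molhiv: molecular graphs with a standard OGB scaffold split.
    molhiv = PygGraphPropPredDataset(root="data/ogb", name="ogbg-molhiv")

    print(len(mutag), len(molhiv))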
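
The Dataset Splits row reports a 50,000/10,000/20,000 split of PT-Motifs. A minimal sketch of producing a split of these sizes, assuming the 80,000 graphs sit in a list and the split is a seeded random shuffle (the paper's actual splitting procedure and seed are not stated here):

    import random

    def split_pt_motifs(graphs, n_train=50_000, n_val=10_000, n_test=20_000, seed=0):
        """Shuffle and slice into train/val/test of the reported sizes."""
        assert len(graphs) == n_train + n_val + n_test  # 80,000 graphs in total
        idx = list(range(len(graphs)))
        random.Random(seed).shuffle(idx)  # seeded for repeatability (assumed)
        train = [graphs[i] for i in idx[:n_train]]
        val = [graphs[i] for i in idx[n_train:n_train + n_val]]
        test = [graphs[i] for i in idx[n_train + n_val:]]
        return train, val, test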
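
The Experiment Setup row lists hyperparameter grids for the pre-training phase. A minimal sketch that enumerates the 4 × 5 × 4 = 80 reported configurations; `pretrain` is a hypothetical stand-in for the authors' training routine, which the report does not include:

    from itertools import product

    # Grids quoted in the Experiment Setup row.
    batch_sizes = [32, 64, 128, 256]
    learning_rates = [1e-3, 5e-3, 1e-4, 1e-5, 1e-6]
    epoch_grid = [20, 40, 60, 80]

    for bs, lr, n_epochs in product(batch_sizes, learning_rates, epoch_grid):
        config = {"batch_size": bs, "lr": lr, "epochs": n_epochs}
        # score = pretrain(config)  # hypothetical: not provided in the report
        print(config)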