Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks

Authors: Xu Zheng, Farhad Shirani, Tianchun Wang, Wei Cheng, Zhuomin Chen, Haifeng Chen, Hua Wei, Dongsheng Luo

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive empirical analyses on both synthetic and real datasets are provided to illustrate that the proposed metrics are more coherent with gold-standard metrics. In this section, we empirically verify the effectiveness of the generalized class of surrogate fidelity measures. We also conduct extensive studies to verify our theoretical claims.
Researcher Affiliation | Collaboration | Xu Zheng (1), Farhad Shirani (1), Tianchun Wang (2), Wei Cheng (3), Zhuomin Chen (1), Haifeng Chen (3), Hua Wei (4), Dongsheng Luo (1); (1) School of Computing and Information Sciences, Florida International University, US; (2) College of Information Sciences and Technology, The Pennsylvania State University, US; (3) NEC Laboratories America, US; (4) School of Computing and Augmented Intelligence, Arizona State University, US; {xzhen019,fshirani,zchen051,dluo}@fiu.edu, tkw5356@psu.edu, {weicheng,haifeng}@nec-labs.com, hua.wei@asu.edu
Pseudocode | Yes | Algorithm 1: Computing Fid_{α1,+} and Algorithm 2: Computing Fid_{α2,−} (a hedged, illustrative sketch of this sampling-based computation appears below the table).
Open Source Code | Yes | The source code is available at https://trustai4s-lab.github.io/fidelity.
Open Datasets | Yes | Four benchmark datasets with ground-truth explanations are used for evaluation: Tree-Circles and Tree-Grid (Ying et al., 2019) for the node classification task, and BA-2motifs (Luo et al., 2020) and MUTAG (Debnath et al., 1991) for graph classification.
Dataset Splits | Yes | We follow existing works (Luo et al., 2020; Ying et al., 2019) and use an 8:1:1 train/validation/test split for all datasets.
Hardware Specification | Yes | We use a Linux machine with 8 NVIDIA A100 GPUs, each with 40 GB of memory, as the hardware platform.
Software Dependencies | Yes | The software environment uses CUDA 11.3, Python 3.7.16, and PyTorch 1.12.1.
Experiment Setup | Yes | The models are trained with the Adam optimizer with an initial learning rate of 1.0 × 10^-3. We use the GCN model and vary the two hyper-parameters α1 and α2 over the values [0.1, 0.3, 0.5, 0.7, 0.9] (a minimal sketch of this setup and the 8:1:1 split also follows the table).
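
The Pseudocode row refers to the paper's sampling-based computation of Fid_{α1,+} and Fid_{α2,−}. The sketch below is a minimal, illustrative version of that idea, not the authors' Algorithm 1/2: it assumes α is the probability of dropping each targeted edge (explanation edges for the "+" variant, non-explanation edges for the "−" variant), and `predict_proba` stands in for any model call returning the probability of the originally predicted class.

```python
# Minimal sketch of a sampling-based fidelity estimate in the spirit of
# Fid_{alpha1,+} / Fid_{alpha2,-}. The function name, the reading of alpha as
# a drop probability, and `predict_proba` are illustrative assumptions, not
# the paper's released implementation.
import random


def sample_fidelity(edges, exp_edges, predict_proba, alpha,
                    drop_explanation=True, n_samples=50, seed=0):
    """Monte-Carlo fidelity estimate under stochastic edge removal.

    edges:            list of all edges in the input graph, e.g. [(u, v), ...]
    exp_edges:        set of edges forming the explanation subgraph
    predict_proba:    callable(edge_list) -> probability of the original class
    alpha:            probability of dropping each targeted edge
    drop_explanation: True perturbs explanation edges (Fid_{alpha,+} style),
                      False perturbs non-explanation edges (Fid_{alpha,-} style)
    """
    rng = random.Random(seed)
    full_prob = predict_proba(edges)
    perturbed = []
    for _ in range(n_samples):
        kept = []
        for e in edges:
            targeted = (e in exp_edges) == drop_explanation
            if targeted and rng.random() < alpha:
                continue  # drop this edge in the sampled graph
            kept.append(e)
        perturbed.append(predict_proba(kept))
    # Fidelity as the average drop in the target-class probability.
    return full_prob - sum(perturbed) / n_samples


if __name__ == "__main__":
    # Toy example with a dummy "model" whose confidence grows with kept edges.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    exp_edges = {(0, 1), (1, 2)}
    predict_proba = lambda kept: len(kept) / len(edges)
    for a in [0.1, 0.3, 0.5, 0.7, 0.9]:  # the grid reported in the setup row
        print(a,
              sample_fidelity(edges, exp_edges, predict_proba, a, True),
              sample_fidelity(edges, exp_edges, predict_proba, a, False))
```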
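
The Dataset Splits and Experiment Setup rows can likewise be read as a short configuration. The snippet below is a hedged sketch under the reported numbers (8:1:1 split, Adam, learning rate 1.0e-3); the dataset size, seed, and the placeholder module standing in for the GCN are assumptions, not the authors' code.

```python
# Sketch of the reported data split and optimizer configuration. The dataset
# size, seed, and the placeholder model are illustrative assumptions.
import torch


def split_indices(n, seed=0):
    """Random 8:1:1 train/validation/test split over n examples."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(n, generator=g)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(1000)

# Adam with the reported initial learning rate of 1.0e-3; a Linear layer
# stands in for the GCN used in the paper.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1.0e-3)
```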