Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks

Authors: Xu Zheng, Farhad Shirani, Tianchun Wang, Wei Cheng, Zhuomin Chen, Haifeng Chen, Hua Wei, Dongsheng Luo

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive empirical analyses on both synthetic and real datasets are provided to illustrate that the proposed metrics are more coherent with gold-standard metrics. In this section, we empirically verify the effectiveness of the generalized class of surrogate fidelity measures. We also conduct extensive studies to verify our theoretical claims.
Researcher Affiliation | Collaboration | Xu Zheng (1), Farhad Shirani (1), Tianchun Wang (2), Wei Cheng (3), Zhuomin Chen (1), Haifeng Chen (3), Hua Wei (4), Dongsheng Luo (1); (1) School of Computing and Information Sciences, Florida International University, US; (2) College of Information Sciences and Technology, The Pennsylvania State University, US; (3) NEC Laboratories America, US; (4) School of Computing and Augmented Intelligence, Arizona State University, US; {xzhen019,fshirani,zchen051,dluo}@fiu.edu, tkw5356@psu.edu, {weicheng,haifeng}@nec-labs.com, hua.wei@asu.edu
Pseudocode | Yes | Algorithm 1: Computing Fid_{α1,+} and Algorithm 2: Computing Fid_{α2,−} (a hedged, illustrative sketch of this sampling-based computation appears below the table).
Open Source Code | Yes | The source code is available at https://trustai4s-lab.github.io/fidelity.
Open Datasets | Yes | Four benchmark datasets with ground-truth explanations are used for evaluation: Tree-Circles and Tree-Grid (Ying et al., 2019) for the node classification task, and BA-2motifs (Luo et al., 2020) and MUTAG (Debnath et al., 1991) for graph classification.
Dataset Splits | Yes | We follow existing works (Luo et al., 2020; Ying et al., 2019) and use an 8:1:1 train/validation/test split for all datasets.
Hardware Specification | Yes | We use a Linux machine with 8 NVIDIA A100 GPUs, each with 40 GB of memory, as the hardware platform.
Software Dependencies | Yes | The software environment uses CUDA 11.3, Python 3.7.16, and PyTorch 1.12.1.
Experiment Setup | Yes | The models are trained with the Adam optimizer with an initial learning rate of 1.0 × 10^-3. We use the GCN model and vary the two hyper-parameters α1 and α2 over the values [0.1, 0.3, 0.5, 0.7, 0.9] (a minimal sketch of this setup and the 8:1:1 split also follows the table).
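
The Pseudocode row refers to the paper's sampling-based computation of Fid_{α1,+} and Fid_{α2,−}. The sketch below is a minimal, illustrative version of that idea, not the authors' Algorithm 1/2: it assumes α is the probability of dropping each targeted edge (explanation edges for the "+" variant, non-explanation edges for the "−" variant), and `predict_proba` stands in for any model call returning the probability of the originally predicted class.

```python
# Minimal sketch of a sampling-based fidelity estimate in the spirit of
# Fid_{alpha1,+} / Fid_{alpha2,-}. The function name, the reading of alpha as
# a drop probability, and `predict_proba` are illustrative assumptions, not
# the paper's released implementation.
import random


def sample_fidelity(edges, exp_edges, predict_proba, alpha,
                    drop_explanation=True, n_samples=50, seed=0):
    """Monte-Carlo fidelity estimate under stochastic edge removal.

    edges:            list of all edges in the input graph, e.g. [(u, v), ...]
    exp_edges:        set of edges forming the explanation subgraph
    predict_proba:    callable(edge_list) -> probability of the original class
    alpha:            probability of dropping each targeted edge
    drop_explanation: True perturbs explanation edges (Fid_{alpha,+} style),
                      False perturbs non-explanation edges (Fid_{alpha,-} style)
    """
    rng = random.Random(seed)
    full_prob = predict_proba(edges)
    perturbed = []
    for _ in range(n_samples):
        kept = []
        for e in edges:
            targeted = (e in exp_edges) == drop_explanation
            if targeted and rng.random() < alpha:
                continue  # drop this edge in the sampled graph
            kept.append(e)
        perturbed.append(predict_proba(kept))
    # Fidelity as the average drop in the target-class probability.
    return full_prob - sum(perturbed) / n_samples


if __name__ == "__main__":
    # Toy example with a dummy "model" whose confidence grows with kept edges.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    exp_edges = {(0, 1), (1, 2)}
    predict_proba = lambda kept: len(kept) / len(edges)
    for a in [0.1, 0.3, 0.5, 0.7, 0.9]:  # the grid reported in the setup row
        print(a,
              sample_fidelity(edges, exp_edges, predict_proba, a, True),
              sample_fidelity(edges, exp_edges, predict_proba, a, False))
```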
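
The Dataset Splits and Experiment Setup rows can likewise be read as a short configuration. The snippet below is a hedged sketch under the reported numbers (8:1:1 split, Adam, learning rate 1.0e-3); the dataset size, seed, and the placeholder module standing in for the GCN are assumptions, not the authors' code.

```python
# Sketch of the reported data split and optimizer configuration. The dataset
# size, seed, and the placeholder model are illustrative assumptions.
import torch


def split_indices(n, seed=0):
    """Random 8:1:1 train/validation/test split over n examples."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(n, generator=g)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(1000)

# Adam with the reported initial learning rate of 1.0e-3; a Linear layer
# stands in for the GCN used in the paper.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1.0e-3)
```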