GNNX-BENCH: Unravelling the Utility of Perturbation-based GNN Explainers through In-depth Benchmarking
Authors: Mert Kosan, Samidha Verma, Burouj Armgaan, Khushbu Pahwa, Ambuj Singh, Sourav Medya, Sayan Ranu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Motivated by this need, we present a benchmarking study on perturbation-based explainability methods for GNNs, aiming to systematically evaluate and compare a wide range of explainability techniques. Among the key findings of our study, we identify the Pareto-optimal methods that exhibit superior efficacy and stability in the presence of noise. Overall, this benchmarking study empowers stakeholders in the field of GNNs with a comprehensive understanding of the state-of-the-art explainability methods, potential research problems for further enhancement, and the implications of their application in real-world scenarios. |
| Researcher Affiliation | Academia | University of California, Santa Barbara; Indian Institute of Technology, Delhi; Rice University; University of Illinois, Chicago |
| Pseudocode | No | The paper does not contain any sections explicitly labeled "Pseudocode" or "Algorithm" with structured steps. |
| Open Source Code | Yes | Codebase: As a by-product, a meticulously curated, publicly accessible code base is provided (https://github.com/idea-iitd/gnn-x-bench/). |
| Open Datasets | Yes | Datasets: Table 4 showcases the principal statistical characteristics of each dataset employed in our experiments, along with the corresponding tasks evaluated on them. The TREE-CYCLES, TREEGRID, and BA-SHAPES datasets serve as benchmark graph datasets for counterfactual analysis. These datasets incorporate ground-truth explanations Tan et al. (2022); Lin et al. (2021a); Lucic et al. (2022). |
| Dataset Splits | Yes | The datasets for factual and counterfactual explainers follow an 80:10:10 split for training, validation and testing. ... The train, validation, and test datasets are divided into an 80:10:10 ratio. |
| Hardware Specification | Yes | All experiments were conducted using the Ubuntu 18.04 operating system on an NVIDIA DGX Station equipped with four V100 GPU cards, each having 128GB of GPU memory. The system also included 256GB of RAM and a 20-core Intel Xeon E5-2698 v4 2.2 GHz CPU. |
| Software Dependencies | No | All experiments were conducted using the Ubuntu 18.04 operating system on an NVIDIA DGX Station... The paper mentions the operating system version but does not provide specific version numbers for key software components like programming languages or libraries (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | The dropout rate, learning rate, and batch size are set to 0, 0.001, and 128, respectively. The algorithms run for 1000 epochs with early stopping after 200 patience steps on the validation set. ... Each model has 3 graph convolutional layers with 20 hidden dimensions for the benchmark datasets. The non-linearity used is relu for the first two layers and log softmax after the last layer of GCN. The learning rate is 0.01. The train and test data are divided in the ratio 80:20. |
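The 80:10:10 train/validation/test protocol reported above can be sketched as a small helper. This is an illustrative sketch, not code from the paper's repository; the function name and seed handling are assumptions.

```python
import random

def split_80_10_10(items, seed=0):
    """Shuffle a dataset and split it 80:10:10 into train/val/test,
    mirroring the split ratio reported in the paper."""
    rng = random.Random(seed)  # fixed seed for a reproducible shuffle
    idx = list(range(len(items)))
    rng.shuffle(idx)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    train = [items[i] for i in idx[:n_train]]
    val = [items[i] for i in idx[n_train:n_train + n_val]]
    test = [items[i] for i in idx[n_train + n_val:]]
    return train, val, test

train, val, test = split_80_10_10(list(range(1000)))
print(len(train), len(val), len(test))  # 800 100 100
```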
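The GCN configuration in the setup row (3 graph convolutional layers, 20 hidden dimensions, ReLU on the first two layers, log softmax after the last) can be illustrated with a minimal NumPy forward pass. This is a sketch under the usual GCN propagation rule with symmetric normalization; the toy graph, feature sizes, and random weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

def normalize_adj(A):
    # Standard GCN preprocessing: D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def log_softmax(x):
    x = x - x.max(axis=1, keepdims=True)  # numerical stability
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def gcn_forward(A, X, weights):
    """3-layer GCN: ReLU after the first two layers, log softmax after the last."""
    A_norm = normalize_adj(A)
    H = X
    for i, W in enumerate(weights):
        H = A_norm @ H @ W
        H = np.maximum(H, 0) if i < len(weights) - 1 else log_softmax(H)
    return H

# Toy 5-node cycle graph with 8 input features, 20 hidden dims, 2 classes.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((5, 8))
weights = [rng.standard_normal((8, 20)) * 0.1,
           rng.standard_normal((20, 20)) * 0.1,
           rng.standard_normal((20, 2)) * 0.1]
out = gcn_forward(A, X, weights)
print(out.shape)  # (5, 2): per-node log-probabilities over 2 classes
```

Each row of the output exponentiates to a valid probability distribution, which is what the final log-softmax layer guarantees.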