Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
D4Explainer: In-distribution Explanations of Graph Neural Network via Discrete Denoising Diffusion
Authors: Jialin Chen, Shirley Wu, Abhijit Gupta, Rex Ying
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations conducted on synthetic and real-world datasets provide compelling evidence of the state-of-the-art performance achieved by D4Explainer in terms of explanation accuracy, faithfulness, diversity, and robustness. Empirical experiments on eight synthetic and real-world datasets show that D4Explainer achieves state-of-the-art performance in both counterfactual and model-level explanations. |
| Researcher Affiliation | Academia | Jialin Chen (Yale University), Shirley Wu (Stanford University), Abhijit Gupta (Yale University), Rex Ying (Yale University) |
| Pseudocode | Yes | Algorithm 1 Reverse Sampling for Model-level Explanation |
| Open Source Code | Yes | The code is available at https://github.com/Graph-and-Geometric-Learning/D4Explainer |
| Open Datasets | Yes | We use four synthetic datasets: BA-shapes, Tree-Cycle, Tree-Grids, and BA-3Motif to evaluate the efficacy of the proposed D4Explainer. In the node-classification task, the graph consists of a base graph to which different motifs (e.g., house, grid, cycle) are randomly attached. We also test D4Explainer on real-world datasets: Cornell [52], Mutag [55, 56], BBBP [57], and NCI1 [58]. |
| Dataset Splits | No | The paper mentions using a 'test dataset' for evaluation and discusses metrics like CF-ACC and Fidelity over '10 different modification ratios from 0 to 0.3'. It also mentions 'test accuracy' for the target GNNs. However, specific percentages or counts for training, validation, and test splits used directly for reproducing *their* D4Explainer experiments are not explicitly provided. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. It only implies that models were trained and experiments were conducted. |
| Software Dependencies | No | In the implementation, we employ Adam [65] as our optimizer and Exponential LR [66] as the scheduler. However, specific version numbers for these or other software libraries (e.g., Python, PyTorch, TensorFlow) are not provided. |
| Experiment Setup | Yes | Table 7 shows the optimal numbers of hidden units, layers in PPGN, batch size, and the regularization coefficient α for each dataset. We run 1500 epochs and set the initial learning rate to 1×10⁻³ across all datasets. |
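For readers assessing reproducibility, the training configuration quoted above (Adam optimizer, Exponential LR scheduler, 1500 epochs, initial learning rate 1×10⁻³) can be sketched in PyTorch. This is a hypothetical illustration, not the authors' code: the placeholder model and the scheduler's decay factor `gamma` are assumptions, since the paper reports neither here (the actual architecture is PPGN-based, and no gamma value is given).

```python
import torch

# Placeholder model; the paper's actual denoising model is PPGN-based.
model = torch.nn.Linear(16, 2)

# Setup as reported in the paper: Adam optimizer with an Exponential LR
# scheduler, initial learning rate 1e-3, 1500 training epochs.
# gamma=0.999 is an assumption -- the paper does not report a decay factor.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)
NUM_EPOCHS = 1500
```

Note that without the scheduler's decay factor (and the per-dataset values from Table 7), exact reproduction of the reported training dynamics is not possible from the text alone, which is consistent with the partial "No" classifications above.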