Global Explainability of GNNs via Logic Combination of Learned Concepts
Authors: Steve Azzolin, Antonio Longa, Pietro Barbiero, Pietro Liò, Andrea Passerini
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted an experimental evaluation on synthetic and real-world datasets aimed at answering the following research questions: |
| Researcher Affiliation | Academia | University of Trento; University of Cambridge; Fondazione Bruno Kessler |
| Pseudocode | No | The paper describes the proposed method step-by-step in narrative text and with mathematical formulas, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' block or figure. |
| Open Source Code | Yes | The source code of GLGExplainer, including the extraction of local explanations, as well as the datasets and all the code for reproducing the results, is made freely available online: https://github.com/steveazzolin/gnn_logic_global_expl |
| Open Datasets | Yes | The source code of GLGExplainer, including the extraction of local explanations, as well as the datasets and all the code for reproducing the results, is made freely available online: https://github.com/steveazzolin/gnn_logic_global_expl |
| Dataset Splits | Yes | Table 2: Mean and standard deviation for Fidelity, Accuracy, and Concept Purity computed over 5 runs with different random seeds. Since the Concept Purity is computed for every cluster independently, here we report mean and standard deviation across clusters over the best run according to the validation set. (A minimal sketch of this seed-averaging protocol appears after the table.) |
| Hardware Specification | No | The paper describes the model architectures (e.g., 2-layers GIN, 3-layers GCN) and training procedures, but it does not specify any particular hardware components (e.g., GPU models, CPU types, memory) used to run the experiments. It only mentions 'we trained our own networks'. |
| Software Dependencies | No | The paper mentions software components like 'PGExplainer', 'GIN', 'GATV2', 'ADAM optimizer', and 'PyTorch implementation'. However, it does not provide specific version numbers for these software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We set the number of prototypes m to 6, 2, and 4 for BAMultiShapes, Mutagenicity, and HIN respectively (see Section 4.4 for an analysis showing how these numbers were inferred), keeping the dimensionality d to 10. We trained using the ADAM optimizer with early stopping and with a learning rate of 1e-3 for the embedding and prototype learning components and a learning rate of 5e-4 for the E-LEN. The batch size was set to 128, the focusing parameter γ to 2, while the auxiliary loss coefficients λ1 and λ2 were set respectively to 0.09 and 0.00099. The E-LEN consists of an input Entropy Layer (R^m -> R^10), a hidden layer (R^10 -> R^5), and an output layer with Leaky ReLU activation function. (A hedged configuration sketch of these hyperparameters appears after the table.) |
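
The Dataset Splits row above quotes the paper's evaluation protocol: metrics averaged over 5 runs with different random seeds, and Concept Purity reported per cluster for the run that is best on the validation set. The sketch below illustrates that protocol only; `train_and_evaluate` is a hypothetical placeholder, not a function from the released GLGExplainer code, and the returned numbers are dummies.

```python
import random
import statistics

def train_and_evaluate(seed: int) -> dict:
    # Placeholder: in the real pipeline this would train GLGExplainer with
    # the given seed and return the measured metrics.
    rng = random.Random(seed)
    return {
        "fidelity": rng.uniform(0.9, 1.0),
        "accuracy": rng.uniform(0.9, 1.0),
        "val_accuracy": rng.uniform(0.9, 1.0),
        "purity_per_cluster": [rng.uniform(0.8, 1.0) for _ in range(6)],
    }

# 5 runs with different random seeds.
runs = [train_and_evaluate(seed) for seed in range(5)]

# Mean and standard deviation across the seeded runs.
for metric in ("fidelity", "accuracy"):
    values = [run[metric] for run in runs]
    print(f"{metric}: {statistics.mean(values):.3f} ± {statistics.stdev(values):.3f}")

# Concept Purity: mean/std across clusters of the best run on the validation set.
best_run = max(runs, key=lambda run: run["val_accuracy"])
purities = best_run["purity_per_cluster"]
print(f"concept purity: {statistics.mean(purities):.3f} ± {statistics.stdev(purities):.3f}")
```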
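
The Experiment Setup row lists the reported hyperparameters. The following is a minimal PyTorch sketch of how such a configuration might be wired up, assuming the BAMultiShapes setting (m = 6, d = 10). The real Entropy Layer comes from the entropy-based Logic Explained Network used in the paper; here a plain `nn.Linear` stands in for it, `embedding_net` and `prototypes` are placeholders for GLGExplainer's embedding and prototype components, and the scalar output dimension is an assumption.

```python
import torch
import torch.nn as nn

m, d = 6, 10                          # prototypes (BAMultiShapes) and embedding size
batch_size = 128
gamma = 2.0                           # focusing parameter of the focal loss
lambda_1, lambda_2 = 0.09, 0.00099    # auxiliary loss coefficients

# E-LEN head: R^m -> R^10 -> R^5 -> output, with Leaky ReLU at the output.
e_len = nn.Sequential(
    nn.Linear(m, 10),                 # stand-in for the input Entropy Layer
    nn.Linear(10, 5),                 # hidden layer
    nn.Linear(5, 1),                  # output layer (dimension assumed)
    nn.LeakyReLU(),
)

# Placeholder embedding network and learnable prototypes.
embedding_net = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
prototypes = nn.Parameter(torch.randn(m, d))

# ADAM with two learning rates: 1e-3 for the embedding/prototype components,
# 5e-4 for the E-LEN, as reported in the paper.
optimizer = torch.optim.Adam([
    {"params": embedding_net.parameters(), "lr": 1e-3},
    {"params": [prototypes], "lr": 1e-3},
    {"params": e_len.parameters(), "lr": 5e-4},
])
```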