Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation

Authors: Zhaoning Yu, Hongyang Gao

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type — Experimental. LLM response: "Our proposed method's effectiveness is demonstrated through quantitative and qualitative assessments conducted on six real-world molecular datasets. The implementation of our method can be found at https://github.com/ZhaoningYu1996/MAGE."
Researcher Affiliation — Academia. LLM response: "Zhaoning Yu, Department of Computer Science, Iowa State University, Ames, IA 50010, USA, EMAIL; Hongyang Gao, Department of Computer Science, Iowa State University, Ames, IA 50010, USA, EMAIL"
Pseudocode — Yes. LLM response: "Algorithm 1: Decoding a Molecule from a Tree Structure"
Open Source Code — Yes. LLM response: "The implementation of our method can be found at https://github.com/ZhaoningYu1996/MAGE."
Open Datasets — Yes. LLM response: "We evaluate the proposed methods using molecule classification tasks on six real-world datasets: Mutagenicity, PTC MR, PTC MM, PTC FM, AIDS, and NCI-H23. The details of the six datasets and experimental settings are presented in Appendix B and C." Table 5 of the paper gives statistics and properties for these datasets, with sources: Mutagenicity (Kazius et al., 2005; Riesen & Bunke, 2008); PTC MR, PTC MM, and PTC FM (Morris et al., 2020); AIDS (Morris et al., 2020); NCI-H23 (Morris et al., 2020).
Dataset Splits — No. LLM response: "The paper does not explicitly provide training/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits with citations). It mentions that a target model is 'pre-trained' and uses 'all datasets', but no splitting methodology is described for the experiments conducted with MAGE."
Hardware Specification — Yes. LLM response: "The CPU in our setup is an AMD Ryzen Threadripper 2990WX, accompanied by 256 GB of memory, and the GPU is an RTX 4090."
Software Dependencies — No. LLM response: "The paper mentions the 'PyTorch Geometric library' and the 'RDKit library' but does not specify their version numbers. It also mentions the 'Adam optimizer', but this is an algorithm, not a software library with a version."
Experiment Setup — Yes. LLM response: "For the target GNN, we use a 3-layer GCN as a feature extractor and a 2-layer MLP as a classifier on all datasets. The model is pre-trained and achieves reasonable performance on all datasets. Following (Xu et al., 2018), the hidden dimension is selected from {16, 64}. We employ mean-pooling as the readout function and ReLU as the activation function. We use the Adam optimizer for training. The target model is trained for 100 epochs, and the learning rate is set to 0.01. For training an explainer, the number of epochs is selected from {50, 100}, the learning rate from {0.01, 0.0001, 0.0005}, and the batch size from {32, 320}."
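The explainer-training selection quoted above amounts to a small grid search over epochs, learning rate, and batch size. A minimal stdlib sketch of that enumeration follows; the candidate sets are taken from the quoted setup, while the `grid_search` function, its `train_and_eval` callback, and the "maximize a validation score" criterion are illustrative assumptions not stated in the paper:

```python
from itertools import product

# Candidate values quoted from the paper's explainer-training setup.
EPOCHS = [50, 100]
LEARNING_RATES = [0.01, 0.0001, 0.0005]
BATCH_SIZES = [32, 320]

def grid_search(train_and_eval):
    """Enumerate every (epochs, lr, batch_size) combination and keep the
    configuration with the highest score. `train_and_eval` is a hypothetical
    callback standing in for actually training and scoring an explainer."""
    best_score, best_cfg = float("-inf"), None
    for epochs, lr, batch_size in product(EPOCHS, LEARNING_RATES, BATCH_SIZES):
        score = train_and_eval(epochs=epochs, lr=lr, batch_size=batch_size)
        if score > best_score:
            best_score, best_cfg = score, (epochs, lr, batch_size)
    return best_cfg, best_score
```

With these sets the grid covers 2 × 3 × 2 = 12 configurations, so exhaustive enumeration is cheap relative to the cost of each training run.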