Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation

Authors: Zhaoning Yu, Hongyang Gao

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type — Experimental. LLM response: "Our proposed method's effectiveness is demonstrated through quantitative and qualitative assessments conducted on six real-world molecular datasets. The implementation of our method can be found at https://github.com/ZhaoningYu1996/MAGE."
Researcher Affiliation — Academia. LLM response: "Zhaoning Yu, Department of Computer Science, Iowa State University, Ames, IA 50010, USA, EMAIL; Hongyang Gao, Department of Computer Science, Iowa State University, Ames, IA 50010, USA, EMAIL"
Pseudocode — Yes. LLM response: "Algorithm 1: Decoding a Molecule from a Tree Structure"
Open Source Code — Yes. LLM response: "The implementation of our method can be found at https://github.com/ZhaoningYu1996/MAGE."
Open Datasets — Yes. LLM response: "We evaluate the proposed methods using molecule classification tasks on six real-world datasets: Mutagenicity, PTC MR, PTC MM, PTC FM, AIDS, and NCI-H23. The details of the six datasets and experimental settings are presented in Appendix B and C." Table 5 of the paper gives statistics and properties for these datasets, with sources: Mutagenicity (Kazius et al., 2005; Riesen & Bunke, 2008); PTC MR, PTC MM, and PTC FM (Morris et al., 2020); AIDS (Morris et al., 2020); NCI-H23 (Morris et al., 2020).
Dataset Splits — No. LLM response: "The paper does not explicitly provide training/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits with citations). It mentions that a target model is 'pre-trained' and uses 'all datasets', but no splitting methodology is described for the experiments conducted with MAGE."
Hardware Specification — Yes. LLM response: "The CPU in our setup is an AMD Ryzen Threadripper 2990WX, accompanied by 256 GB of memory, and the GPU is an RTX 4090."
Software Dependencies — No. LLM response: "The paper mentions the 'PyTorch Geometric library' and the 'RDKit library' but does not specify their version numbers. It also mentions the 'Adam optimizer', but this is an algorithm, not a software library with a version."
Experiment Setup — Yes. LLM response: "For the target GNN, we use a 3-layer GCN as a feature extractor and a 2-layer MLP as a classifier on all datasets. The model is pre-trained and achieves reasonable performance on all datasets. Following (Xu et al., 2018), the hidden dimension is selected from {16, 64}. We employ mean-pooling as the readout function and ReLU as the activation function. We use the Adam optimizer for training. The target model is trained for 100 epochs, and the learning rate is set to 0.01. For training an explainer, the number of epochs is selected from {50, 100}, the learning rate from {0.01, 0.0001, 0.0005}, and the batch size from {32, 320}."
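The explainer-training selection quoted above amounts to a small grid search over epochs, learning rate, and batch size. A minimal stdlib sketch of that enumeration follows; the candidate sets are taken from the quoted setup, while the `grid_search` function, its `train_and_eval` callback, and the "maximize a validation score" criterion are illustrative assumptions not stated in the paper:

```python
from itertools import product

# Candidate values quoted from the paper's explainer-training setup.
EPOCHS = [50, 100]
LEARNING_RATES = [0.01, 0.0001, 0.0005]
BATCH_SIZES = [32, 320]

def grid_search(train_and_eval):
    """Enumerate every (epochs, lr, batch_size) combination and keep the
    configuration with the highest score. `train_and_eval` is a hypothetical
    callback standing in for actually training and scoring an explainer."""
    best_score, best_cfg = float("-inf"), None
    for epochs, lr, batch_size in product(EPOCHS, LEARNING_RATES, BATCH_SIZES):
        score = train_and_eval(epochs=epochs, lr=lr, batch_size=batch_size)
        if score > best_score:
            best_score, best_cfg = score, (epochs, lr, batch_size)
    return best_cfg, best_score
```

With these sets the grid covers 2 × 3 × 2 = 12 configurations, so exhaustive enumeration is cheap relative to the cost of each training run.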