Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Empowering GNNs via Edge-Aware Weisfeiler-Leman Algorithm

Authors: Meng Liu, Haiyang Yu, Shuiwang Ji

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our NC-GNN performs effectively and efficiently on various benchmarks. In this section, we evaluate the effectiveness of the proposed NC-GNN model on real benchmarks.
Researcher Affiliation | Academia | Meng Liu EMAIL, Department of Computer Science & Engineering, Texas A&M University
Pseudocode | Yes | Algorithm 1: NC-1-WL vs. 1-WL for graph isomorphism test; Algorithm 2: k-WL for graph isomorphism test
Open Source Code | No | The paper states: "Our implementation is based on the PyG library (Fey & Lenssen, 2019)." This refers to a third-party library the authors build on, not a statement of release or a link to their own implementation of the work described in this paper.
Open Datasets | Yes | We consider widely used datasets from TUDatasets (Morris et al., 2020a), Open Graph Benchmark (OGB) (Hu et al., 2020), and GNN Benchmark (Dwivedi et al., 2020).
Dataset Splits | Yes | We use the same number of layers as GIN and report the 10-fold cross-validation accuracy following the protocol of Xu et al. (2019) for a fair comparison. Results over 10 random runs are reported. Average results over 4 random runs are reported in Table 3. As in Chen et al. (2020) and Zhao et al. (2022), we use the same 30%/20%/50% (training/validation/test) split.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running its experiments. While Table 5 mentions "GPU Memory", it does not specify the type of GPU.
Software Dependencies | No | The paper mentions that its implementation is based on the "PyG library (Fey & Lenssen, 2019)", but it does not specify version numbers for PyG itself, Python, PyTorch, or any other critical software dependencies required for replication.
Experiment Setup | Yes | The detailed model configurations and training hyperparameters of NC-GNN on each dataset are summarized in Table 9, Appendix C.3. For the model architecture, we tune the following configurations: (1) the number of layers, (2) the number of hidden dimensions, (3) using the jumping knowledge (JK) technique or not, and (4) using a residual connection or not. In terms of training, we consider tuning the following hyperparameters: (1) the initial learning rate, (2) the step size of learning rate decay, (3) the multiplicative factor of learning rate decay, (4) the batch size, (5) the dropout rate, and (6) the total number of epochs.
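For context on the Pseudocode row: the paper's Algorithm 1 contrasts NC-1-WL with the classic 1-WL color refinement. The sketch below shows only the standard 1-WL baseline that NC-1-WL extends, not the authors' edge-aware variant; the function name `wl_refine` and the adjacency-dict representation are illustrative assumptions.

```python
def wl_refine(adj, rounds=3):
    # Minimal sketch of classic 1-WL color refinement (baseline only;
    # NOT the paper's edge-aware NC-1-WL). `adj` maps each node to a
    # list of its neighbors.
    colors = {v: 0 for v in adj}  # all nodes start with the same color
    for _ in range(rounds):
        # New signature = (own color, sorted multiset of neighbor colors).
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress signatures back into small integer color labels.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors
```

On a 3-node path, refinement separates the middle node from the two endpoints, while on a triangle all nodes keep the same color, illustrating how 1-WL partitions nodes by local structure.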
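The Dataset Splits row quotes a 30%/20%/50% train/validation/test split. A minimal sketch of such a random split follows; `random_split` is a hypothetical helper written for illustration, not the authors' code.

```python
import random

def random_split(n, fractions=(0.3, 0.2, 0.5), seed=0):
    # Illustrative sketch (not the paper's implementation): shuffle the
    # sample indices and cut them into train/validation/test portions
    # matching the quoted 30%/20%/50% split.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = random_split(100)
```

The 10-fold cross-validation protocol mentioned in the same row would instead rotate which tenth of the data serves as the held-out fold.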