Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Understanding and Enhancing Message Passing on Heterophilic Graphs via Compatibility Matrix

Authors: Zhuonan Zheng, Yuanchen Bei, Zhiyao Zhou, Sheng Zhou, Yao Ma, Ming Gu, Hongjia Xu, Jiawei Chen, Jiajun Bu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	A thorough evaluation involving 13 datasets and comparison against 20 well-established baselines highlights the superiority of CMGNN. ... We conduct fair comparisons to evaluate the effectiveness of CMGNN, compared with 20 baseline methods on 13 datasets with varying homophily levels and scales. Extensive experimental results demonstrate that CMGNN outperforms all baseline methods on heterophilic graphs while also being competitive on homophilic graphs.
Researcher Affiliation	Academia	Zhejiang Key Laboratory of Accessible Perception and Intelligent Systems, Zhejiang University College of Computer Science and Technology, Zhejiang University School of Software Technology, Zhejiang University School of Science, Rensselaer Polytechnic Institute
Pseudocode	Yes	More details of CMGNN including pseudo-code are available in Appendix E. ... Algorithm 1 Algorithm of CMGNN
Open Source Code	Yes	Our code is available at https://github.com/zfx233/CMGNN.
Open Datasets	Yes	The newly organized datasets include (i) small-scale: Roman-Empire, Amazon-Ratings, Chameleon-F, Squirrel-F, Actor, Flickr, BlogCatalog and Pubmed; (ii) large-scale: Penn94, Twitch-Gamer, Genius, Pokec and Snap-Patents. ... https://github.com/yandex-research/heterophilous-graphs/tree/main/data ... https://github.com/bingzhewei/geom-gcn/tree/master/new_data/film ... https://github.com/TrustAGI-Lab/CoLA/tree/main/raw_dataset ... https://linqs.soe.ucsc.edu/datac ... https://github.com/CUAI/Non-Homophily-Large-Scale/tree/master/data
Dataset Splits	Yes	For consistency with existing methods, we randomly construct 10 splits with predefined proportions (48% / 32% / 20% for training / validation / test) for each dataset and report the mean accuracy and standard deviation of 10 splits.
Hardware Specification	Yes	We run these experiments on NVIDIA GeForce RTX 3090 GPUs with 24G memory.
Software Dependencies	No	The codebase is based on the widely used PyTorch framework, supporting both DGL and PyG.
Experiment Setup	Yes	CMGNN has the same experimental settings within the benchmark, including datasets, splits, evaluations, hardware, optimizer, and so on. Parameters Search Space. We list the search space of parameters in Table 7, where patience is for the maximum epoch early stopping, n_hidden is the embedding dimension of hidden layers as well as the representation dimension d_r, relu_variant decides ReLU applying before message aggregation or not as in Luan et al. [12], structure_info determines whether to use structure information as supplement node features or not. Table 7: Parameter search space of our method (parameter: range). learning rate: {0.001, 0.005, 0.01, 0.05}; weight_decay: {0, 1e-7, 5e-7, 1e-6, 5e-6, 5e-5, 5e-4}; patience: {200, 400}; dropout: [0, 0.9]; λ: {0, 0.01, 0.1, 1, 10}; layers: {1, 2, 4, 8}; n_hidden: {32, 64, 128, 256}; relu_variant: {True, False}; structure_info: {True, False}
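
The Dataset Splits and Experiment Setup rows above describe 10 random 48% / 32% / 20% splits and a parameter search space. A minimal Python sketch of how these could be set up follows. It is not taken from the CMGNN repository: the names `SEARCH_SPACE`, `random_split`, and `sample_config` are hypothetical, the 0.1 step for dropout is an assumption (the table gives only the interval [0, 0.9]), the graph size of 1000 nodes is a placeholder, and uniform random sampling is assumed because the quoted text does not state the search strategy.

```python
import random

# Search space transcribed from Table 7 as quoted above.
# ASSUMPTION: dropout is given only as the interval [0, 0.9];
# the 0.1 grid step used here is illustrative.
SEARCH_SPACE = {
    "learning_rate": [0.001, 0.005, 0.01, 0.05],
    "weight_decay": [0, 1e-7, 5e-7, 1e-6, 5e-6, 5e-5, 5e-4],
    "patience": [200, 400],
    "dropout": [round(0.1 * i, 1) for i in range(10)],
    "lambda": [0, 0.01, 0.1, 1, 10],
    "layers": [1, 2, 4, 8],
    "n_hidden": [32, 64, 128, 256],
    "relu_variant": [True, False],
    "structure_info": [True, False],
}


def random_split(num_nodes, seed, train_frac=0.48, val_frac=0.32):
    """Shuffle node indices and cut them into train/val/test index lists.

    Hypothetical helper: the remaining nodes (20% by default) form the test set.
    """
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    n_train = round(train_frac * num_nodes)
    n_val = round(val_frac * num_nodes)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def sample_config(seed):
    """Draw one configuration uniformly at random.

    ASSUMPTION: random search; the source does not name the search strategy.
    """
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


# Ten seeded splits for a placeholder graph with 1000 nodes.
splits = [random_split(1000, seed) for seed in range(10)]
config = sample_config(0)
```

Seeding each split separately keeps the 10 splits reproducible across runs, so the mean accuracy and standard deviation can be recomputed over the same partitions.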