Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective

Authors: Ming Gu, Zhuonan Zheng, Sheng Zhou, Meihan Liu, Jiawei Chen, Qiaoyu Tan, Liangcheng Li, Jiajun Bu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Benchmarking against 30 baselines demonstrates IGNN's superiority and reveals notable universality in certain homophilic GNN variants. Our code and datasets are available at https://github.com/galogm/IGNN. ... Benchmark and Empirical Findings. We establish a comprehensive benchmark consisting of 30 representative baselines to assess the effectiveness of our design principles. Our results demonstrate that not only can classic GCNs enhanced with these principles achieve state-of-the-art (SOTA) performance, but also that certain existing homo GNNs inherently possess universal capabilities.
Researcher Affiliation	Academia	Zhejiang Key Laboratory of Accessible Perception and Intelligent Systems, Zhejiang University; College of Computer Science and Technology, Zhejiang University; School of Software Technology, Zhejiang University; China University of Mining and Technology; Department of Computer Science, New York University Shanghai. Corresponding Author: EMAIL.
Pseudocode	No	The paper describes the methodology in text and mathematical formulations but does not include any clearly labeled pseudocode or algorithm blocks. The structure of IGNN is described in Section 5.1 Inceptive GNN Framework (IGNN) and its variants in Table 2, but these are descriptions and not pseudocode.
Open Source Code	Yes	Our code and datasets are available at https://github.com/galogm/IGNN.
Open Datasets	Yes	Datasets: Following recent works [54], we select 13 representative datasets of various sizes, excluding those too small or class-imbalanced [27]: (i) Heterophily: Roman-empire, BlogCatalog, Flickr, Actor, Squirrel-filtered, Chameleon-filtered, Amazon-ratings, Pokec; (ii) Homophily: PubMed, Photo, wikics, ogbn-arxiv, ogbn-products. The statistics are in Table 3 and 4.
Dataset Splits	Yes	Settings: We randomly construct 10 splits with proportions of 48%/32%/20% for training/validation/testing, which is guided by our theoretical emphasis on generalization. Prior work [40] has shown that different splitting strategies can lead to substantial variations in structural distributions, thereby influencing generalization behavior. To mitigate this, we adopt a unified split scheme [19, 22], reducing variance across datasets that may arise from the heterogeneous splitting policies used in earlier studies. For the large-size datasets (ogbn-arxiv, Pokec, and ogbn-products), we use the public splits.
Hardware Specification	No	The paper discusses runtime efficiency in Appendix B.3 and mentions evaluating training efficiency, but it does not specify any particular hardware components such as GPU models, CPU models, or memory specifications used for the experiments.
Software Dependencies	No	The network is optimized using the Adam [55], with hyperparameter settings provided in Appendix E.2. The paper mentions the Adam optimizer but does not specify programming languages or versions of libraries/frameworks (e.g., Python, PyTorch, TensorFlow) used in the implementation.
Experiment Setup	Yes	Settings: We randomly construct 10 splits with proportions of 48%/32%/20% for training/validation/testing... The network is optimized using the Adam [55], with hyperparameter settings provided in Appendix E.2.
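
The split protocol quoted in the Dataset Splits and Experiment Setup rows (10 random splits at 48%/32%/20% for training/validation/testing) could be reproduced along the following lines. This is a minimal sketch, not the authors' implementation: the function name make_splits, the per-split seeding scheme, and the rounding of split sizes are assumptions made for illustration, and the code in the linked repository may differ. It also does not cover the large datasets (ogbn-arxiv, Pokec, ogbn-products), for which the paper uses the public splits.

```python
import random


def make_splits(num_nodes, num_splits=10, fractions=(0.48, 0.32, 0.20), base_seed=0):
    """Build random train/val/test node-index splits.

    Hypothetical helper: the name, the seeding scheme, and the rounding
    rule are illustrative assumptions, not taken from the paper.
    """
    n_train = round(fractions[0] * num_nodes)
    n_val = round(fractions[1] * num_nodes)
    splits = []
    for i in range(num_splits):
        # One independent, reproducible permutation per split (assumed scheme).
        rng = random.Random(base_seed + i)
        perm = list(range(num_nodes))
        rng.shuffle(perm)
        train = perm[:n_train]
        val = perm[n_train:n_train + n_val]
        # Remaining nodes form the test set, so the three parts cover all nodes.
        test = perm[n_train + n_val:]
        splits.append((train, val, test))
    return splits


splits = make_splits(1000)
```

Each returned tuple partitions the node indices, so reported metrics can be averaged over the 10 splits. Fixing the seed per split keeps the partitions identical across runs and across the models being compared.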