Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

MOTION: Multi-Sculpt Evolutionary Coarsening for Federated Continual Graph Learning

Authors: Frank Wan, Fengyuan Ran, Ruikang Zhang, Wenke Huang, Xuankun Rong, Guibin Zhang, Yuxin Wu, Bo Du, Mang Ye

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on real-world datasets show that our approach improves average accuracy (AA) by an average of 30% over the FedAvg baseline across five datasets while maintaining a negative average forgetting (AF) rate, significantly enhancing generalization and robustness under FCGL settings. The code is available for anonymous access at https://github.com/GuanchengWan/MOTION. 4 Experiment: In this section, we comprehensively evaluate MOTION through four axes: Q1 (Superiority), Q2 (Resilience), Q3 (Effectiveness), Q4 (Sensitivity).
Researcher Affiliation	Academia	Guancheng Wan1, Fengyuan Ran1, Ruikang Zhang2, Wenke Huang1, Xuankun Rong1, Guibin Zhang2, Yuxin Wu3, Bo Du1, Mang Ye1; 1Wuhan University, 2Tongji University, 3Renmin University of China. EMAIL
Pseudocode	No	The paper describes the methodology in prose, without structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code is available for anonymous access at https://github.com/GuanchengWan/MOTION.
Open Datasets	Yes	To effectively evaluate the performance of our approach, we employed five benchmark graph datasets of various scales and distributions, including Cora [35], CiteSeer [16], PubMed [5], Amazon-Photo, and Coauthor-CS [44]. Detailed descriptions and dataset splits are provided in Appendix C.1.
Dataset Splits	Yes	Each dataset is partitioned into fixed subsets of 20% for training, 40% for validation, and 40% for testing.
Hardware Specification	Yes	All experiments were conducted on a system equipped with an NVIDIA GeForce RTX 3090 GPU (24 GB), an Intel Xeon Gold 6330 CPU @ 2.00 GHz (14 cores, 28 threads), and 90 GB of RAM, running Ubuntu 22.04 with Python 3.12, PyTorch 2.3.0, and CUDA 12.1.
Software Dependencies	Yes	All experiments were conducted on a system equipped with an NVIDIA GeForce RTX 3090 GPU (24 GB), an Intel Xeon Gold 6330 CPU @ 2.00 GHz (14 cores, 28 threads), and 90 GB of RAM, running Ubuntu 22.04 with Python 3.12, PyTorch 2.3.0, and CUDA 12.1.
Experiment Setup	Yes	Our GAT architecture consisted of three attention layers with a hidden dimension of 64, a dropout rate of 0.3, a learning rate of 0.005, and a weight decay of 4 × 10⁻⁴. We split each dataset into 20% training, 40% validation, and 40% test subsets, fixed the random seed at 4 for reproducibility, and performed all computations on GPU 0. The federated setup comprised two clients participating in a single communication round using the FedAvg algorithm. To induce label skew, we partitioned each dataset's eight classes across clients according to a Dirichlet distribution with concentration parameter α = 5.0.
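
Illustrative sketch (not taken from the paper's code): the Experiment Setup row mentions a Dirichlet label-skew partition (α = 5.0, two clients, seed 4) and a 20%/40%/40% split. The Python below shows one common way such a setup could be realized using only the standard library. The function names, the per-class sampling scheme, and the order of partitioning and splitting are assumptions made for illustration; the authors' repository may do this differently.

```python
import random
from collections import defaultdict


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k components,
    by normalizing k independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_label_partition(labels, num_clients=2, alpha=5.0, seed=4):
    """Hypothetical helper: assign node indices to clients so that each
    class is divided according to its own Dirichlet-sampled proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        props = sample_dirichlet(alpha, num_clients, rng)
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += props[c]
            # The last client takes the remainder so no index is dropped.
            if c == num_clients - 1:
                end = len(idxs)
            else:
                end = round(cumulative * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


def split_20_40_40(indices, seed=4):
    """Hypothetical helper: shuffle indices and cut them into
    20% training, 40% validation, and 40% test subsets."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 2 // 10
    n_val = len(shuffled) * 4 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Toy labels: 800 nodes spread evenly over eight classes.
    toy_labels = [i % 8 for i in range(800)]
    parts = dirichlet_label_partition(toy_labels)
    for client_id, part in enumerate(parts):
        train, val, test = split_20_40_40(part)
        print(client_id, len(part), len(train), len(val), len(test))
```

With α = 5.0 the sampled proportions tend to be moderately unbalanced; smaller values of α would produce a stronger label skew across clients.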