Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Multi-order Orchestrated Curriculum Distillation for Model-Heterogeneous Federated Graph Learning

Authors: Frank Wan, Xu Cheng, Run Liu, Wenke Huang, Zitong Shi, Pinyi Jin, Guibin Zhang, Bo Du, Mang Ye

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on multiple graph benchmarks and model-heterogeneous settings show that TRUST outperforms existing methods, achieving an average 3.6% performance gain, particularly under moderate heterogeneity conditions. In this section, we comprehensively evaluate TRUST through four axes: Q1 (Superiority), Q2 (Resilience), Q3 (Effectiveness), Q4 (Sensitivity),
Researcher Affiliation	Academia	Guancheng Wan¹, Xu Cheng¹, Run Liu¹, Wenke Huang¹, Zitong Shi¹, Pinyi Jin¹, Guibin Zhang², Bo Du¹, Mang Ye¹. ¹Wuhan University, ²NUS
Pseudocode	No	The paper describes the methodology and its components (PCNS, ACDM, WDAD) using textual descriptions and mathematical formulas, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The code is available for anonymous access at https://github.com/GuanchengWan/TRUST.
Open Datasets	Yes	To effectively evaluate the performance of our approach, we employed five benchmark graph datasets of various scales and distributions, including Cora [31], CiteSeer [7], PubMed [38], CS, and Photo. Detailed descriptions and splits for these datasets can be found in Appendix C.1. Cora, CiteSeer, and PubMed: These three citation network datasets are standard benchmarks in graph-based machine learning, especially for tasks like node classification and link prediction.
Dataset Splits	Yes	Each dataset is split into training, validation, and test sets in a fixed 20%/40%/40% ratio. The key statistics of these datasets are summarized in Tab. 3. Each client's subgraph is split into training, validation, and test sets with a ratio of 0.6/0.2/0.2, respectively.
Hardware Specification	Yes	The experiments are conducted on NVIDIA GeForce RTX 3090 GPUs, paired with dual Intel(R) Xeon(R) Gold 6240 CPUs @ 2.60GHz (36 cores per socket, Turbo Boost up to 3.90GHz).
Software Dependencies	Yes	The deep learning framework used is PyTorch (v2.5.1) with CUDA 12.1.
Experiment Setup	Yes	The experimental setup involves 10 clients. To simulate real-world model heterogeneity, each client maintains a private model whose architecture is randomly selected from GCN, GAT, or GraphSAGE. All private models are configured with three layers, a hidden dimension of 64, and a dropout rate of 0.3. To facilitate collaboration, each client is equipped with an additional small proxy model that serves as a communication bridge. This proxy model employs a standardized GCN architecture with 3 layers to ensure compatibility across clients. On the server side, we implement a global model that also adopts a GCN backbone and uses a hidden dimension of 32 while sharing the remaining configurations with the client models. ... We use the Adam optimizer with a learning rate of 5 × 10^-3 and a weight decay of 4 × 10^-4 for training. The number of communication rounds is set to 200. For PCNS, in the Difficulty Measurer, we set α = 0.5. In the Curriculum Scheduler, the hyperparameters λ and T are selected via grid search over {0.25, 0.5, 0.75} and {20, 40, 80, 100}, respectively. For ACDM, the model parameters θ_model are optimized using Adam (learning rate: 0.01, weight decay: 5 × 10^-4), while the temperature parameters θ_temp are optimized using SGD (momentum: 0.9, weight decay: 4 × 10^-4). The conformal mode parameters in backward distillation follow the configuration in FedType. In WDAD, we set η = 0.05, κ = 1.0, with loss weights α = 0.025 (Wasserstein) and β = 0.01 (KL divergence).
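
The reported setup can be collected into a small configuration sketch. This is only an illustration of the values quoted in the Experiment Setup and Dataset Splits rows, not the authors' code: the names `ClientConfig`, `make_clients`, `scheduler_grid`, and `split_indices` are hypothetical, the random seed is an assumption, and the rounding rule used for the 0.6/0.2/0.2 split is a guess, since the quoted text does not specify one.

```python
import random
from dataclasses import dataclass
from itertools import product

# Private-model architectures named in the Experiment Setup row.
ARCHITECTURES = ("GCN", "GAT", "GraphSAGE")


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical container for one client's private-model settings."""
    client_id: int
    architecture: str
    num_layers: int = 3      # "three layers"
    hidden_dim: int = 64     # "hidden dimension of 64"
    dropout: float = 0.3     # "dropout rate of 0.3"


def make_clients(num_clients=10, seed=0):
    """Assign each client a randomly chosen architecture (seed is an assumption)."""
    rng = random.Random(seed)
    return [ClientConfig(i, rng.choice(ARCHITECTURES)) for i in range(num_clients)]


def scheduler_grid():
    """Enumerate the (lambda, T) grid reported for the Curriculum Scheduler."""
    lambdas = (0.25, 0.5, 0.75)
    periods = (20, 40, 80, 100)
    return list(product(lambdas, periods))


def split_indices(num_nodes, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle node indices and cut them into train/val/test parts.

    The rounding rule is an assumption: train and val sizes are rounded,
    and the test set takes whatever remains.
    """
    rng = random.Random(seed)
    indices = list(range(num_nodes))
    rng.shuffle(indices)
    n_train = round(ratios[0] * num_nodes)
    n_val = round(ratios[1] * num_nodes)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    clients = make_clients()
    print(len(clients), "clients;", len(scheduler_grid()), "grid points")
    train, val, test = split_indices(100)
    print(len(train), len(val), len(test))
```

The grid has 3 × 4 = 12 candidate (λ, T) pairs, so a full search under this setup would mean twelve runs per dataset and heterogeneity setting, each with 200 communication rounds.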