Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections

Authors: Wei Zhuo, Zhaohuan Zhan, Han Yu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluations across diverse graph benchmarks demonstrate that FedAux substantially outperforms existing baselines in both accuracy and personalization performance.
Researcher Affiliation	Academia	Wei Zhuo¹, Zhaohuan Zhan², Han Yu¹; ¹Nanyang Technological University, ²Shenzhen MSU-BIT University
Pseudocode	Yes	Appendix B shows the pseudocode of FedAux.
Open Source Code	Yes	The code is available at https://github.com/JhuoW/FedAux.
Open Datasets	Yes	Specifically, we perform experiments on six widely used datasets, including four citation networks (Cora, CiteSeer, Pubmed [25], and ogbn-arxiv [10]) and two product co-purchase networks (Amazon-Computer and Amazon-Photo [19, 26]).
Dataset Splits	Yes	For dataset splitting, we randomly sample 20%/40%/40% of nodes from each subgraph for training, validation, and testing, respectively.
Hardware Specification	Yes	All experiments are executed on a workstation equipped with an NVIDIA Tesla V100 SXM2 GPU (32 GB) running CUDA 12.4.
Software Dependencies	No	All experiments are executed on a workstation equipped with an NVIDIA Tesla V100 SXM2 GPU (32 GB) running CUDA 12.4. (Note: Only the CUDA version is specified; no other software or library versions are mentioned.)
Experiment Setup	Yes	We employ MaskedGCN [3] to generate node embeddings and sweep the number of GCN layers over L ∈ {1, 2, 3}. The hidden dimension is selected from d ∈ {64, 128, 256}, and dropout probabilities are set to 0.5. The auxiliary projection vector a is initialized from a Gaussian distribution in R^d. The similarity temperature parameter α is set to 10, and the bandwidth σ is fixed to 1. For the FL schedule, we run T = 100 communication rounds with Q = 1 local epoch on the smaller citation datasets (Cora, CiteSeer, Pubmed). On all other datasets, we set the total number of rounds to T = 200 and the number of local epochs per round to Q = 2.
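
The Dataset Splits and Experiment Setup rows can be read as a small configuration sketch. The Python below is an illustration only and is not taken from the authors' repository. The function names (split_nodes, fl_schedule, configs), the dictionary keys, and the seed argument are hypothetical, and the paper does not say how fractional node counts are rounded, so integer floor division is an assumption here.

```python
import itertools
import random

# Values reported in the Experiment Setup row; key names are hypothetical.
GRID = {"num_layers": [1, 2, 3], "hidden_dim": [64, 128, 256]}
FIXED = {"dropout": 0.5, "alpha": 10, "sigma": 1}
SMALL_CITATION = {"Cora", "CiteSeer", "Pubmed"}


def split_nodes(num_nodes, seed=0):
    """Randomly split node indices 20%/40%/40% into train/val/test.

    Assumption: sizes are floored; the test set absorbs any remainder.
    """
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    n_train = num_nodes // 5        # 20%
    n_val = 2 * num_nodes // 5      # 40%
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]    # remaining ~40%
    return train, val, test


def fl_schedule(dataset):
    """Return communication rounds T and local epochs Q for a dataset."""
    if dataset in SMALL_CITATION:
        return {"rounds": 100, "local_epochs": 1}
    return {"rounds": 200, "local_epochs": 2}


def configs(dataset):
    """Enumerate the swept settings (layers x hidden dimension)."""
    for layers, hidden in itertools.product(GRID["num_layers"], GRID["hidden_dim"]):
        yield {
            "dataset": dataset,
            "num_layers": layers,
            "hidden_dim": hidden,
            **FIXED,
            **fl_schedule(dataset),
        }
```

For example, list(configs("Cora")) enumerates nine settings, each with 100 rounds and 1 local epoch, while configs("ogbn-arxiv") uses 200 rounds and 2 local epochs. The split is applied per client subgraph, so split_nodes would be called once for each subgraph's node count.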