Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Large Language Models Enhanced Personalized Graph Neural Architecture Search in Federated Learning

Authors: Hui Fang, Yang Gao, Peng Zhang, Jiangchao Yao, Hongyang Chen, Haishuai Wang

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive evaluations show that PFGNAS significantly outperforms traditional PFL methods, highlighting the advantages of integrating LLMs into personalized federated learning environments.
Researcher Affiliation Academia 1 Zhejiang Key Laboratory of Accessible Perception and Intelligent Systems, College of Computer Science, Zhejiang University, China; 2 Cyberspace Institute of Advanced Technology, Guangzhou University, China; 3 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, China; 4 Research Center for Data Hub and Security, Zhejiang Lab, China
Pseudocode No The paper describes the methodology in text and mathematical formulations but does not present a structured pseudocode or algorithm block.
Open Source Code Yes Code: https://github.com/HuiFang-hub/PFGNAS
Open Datasets Yes To validate the efficacy of our proposed methodology, we conducted simulations of federated learning scenarios utilizing three widely-recognized datasets, namely, Cora, Citeseer, and Pubmed.
Dataset Splits Yes The partitioning of each dataset into N clients employs two distinct partitioning strategies (Yurochkin et al. 2019). First, we implemented a homogeneous partitioning scheme, ensuring that each client possesses an approximately equal distribution across the K classes, achieved through Dirichlet distribution sampling with p_k ∼ Dir_N(β = 10). In contrast, a heterogeneous partitioning approach was employed by simulating p_k ∼ Dir_N(β = 0.2) and allocating a proportion p_{k,N} of class-k instances to the N clients.
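The Dirichlet-based client partitioning quoted above can be sketched as follows. This is an illustrative NumPy reconstruction under the stated scheme, not the authors' implementation; the function name and signature are hypothetical. For each class k, client proportions are drawn as p_k ∼ Dir_N(β), where a large β (e.g. 10) yields near-homogeneous splits and a small β (e.g. 0.2) yields heterogeneous, non-IID splits.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, beta, seed=0):
    """Split sample indices across n_clients clients.

    For each class k, draw proportions p_k ~ Dir_N(beta) and give
    each client that fraction of the class-k samples.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_idx = [[] for _ in range(n_clients)]
    for k in np.unique(labels):
        idx_k = np.flatnonzero(labels == k)
        rng.shuffle(idx_k)
        # Per-class client proportions from a symmetric Dirichlet
        p_k = rng.dirichlet(np.full(n_clients, beta))
        # Cumulative proportions -> split points within the class indices
        cuts = (np.cumsum(p_k)[:-1] * len(idx_k)).astype(int)
        for client, part in zip(client_idx, np.split(idx_k, cuts)):
            client.extend(part.tolist())
    return [np.array(c) for c in client_idx]

# Example: 3 classes x 100 samples, 5 clients, heterogeneous split
labels = np.repeat(np.arange(3), 100)
parts = dirichlet_partition(labels, n_clients=5, beta=0.2)
```

Every sample lands on exactly one client; with β = 0.2 the per-client class distributions are visibly skewed, while β = 10 produces roughly equal shares.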
Hardware Specification Yes We run all experiments for three random repetitions on an NVIDIA RTX 3090 GPU.
Software Dependencies No In particular, we use GLM4 as the default LLM, and we also compared it to GPT with a temperature τ = 0.5. Besides, we choose the Adam optimizer with a learning rate of 1e-3. The paper mentions specific LLM models (GLM4, GPT) and an optimizer (Adam) but does not provide version numbers for these or for other key software components like programming languages or libraries (e.g., Python, PyTorch, CUDA).
Experiment Setup Yes In our federated learning setting, we set the number of clients N to [3, 5, 10, 20], and the total round number to 100. In particular, we use GLM4 as the default LLM, and we also compared it to GPT with a temperature τ = 0.5. Besides, we choose the Adam optimizer with a learning rate of 1e-3. The number of GNN layers is set to 2.