Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Towards Robust Parameter-Efficient Fine-Tuning for Federated Learning

Authors: Xiuwen Fang, Mang Ye

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Comprehensive experimental validation shows RFedLR outperforms existing methods, achieving superior accuracy and robustness in noisy federated scenarios. Our code is available at: https://github.com/FangXiuwen/RFedLR
Researcher Affiliation	Academia	1 School of Computer Science, Wuhan University, Wuhan, China 2 Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, China EMAIL
Pseudocode	Yes	A The Algorithm of RFedLR We summarize the Pseudocode of RFedLR in Algorithm 1.
Open Source Code	Yes	Our code is available at: https://github.com/FangXiuwen/RFedLR
Open Datasets	Yes	Following previous works [53, 61], we conduct extensive experiments on CIFAR-100 [62] dataset, which contains 60,000 color images covering 100 classes.
Dataset Splits	Yes	The FL setup involved K = 5 clients, each holding a non-IID partition of CIFAR-100. The private data is partitioned among clients according to a Dirichlet distribution, with the degree of data heterogeneity controlled by the hyperparameter β, which is set to 0.5. We maintain a mini-batch of clean CIFAR-100 dataset on the server as the public proxy dataset Dc with |Dc| = 256.
Hardware Specification	Yes	Experiments are conducted on 4 NVIDIA RTX 3090 GPUs.
Software Dependencies	No	All methods were implemented within a unified PyTorch framework for fair comparison.
Experiment Setup	Yes	Local fine-tuning is performed for one epoch per communication round. For optimization, we use SGD with a learning rate of 0.01, weight decay of 0.0001, momentum of 0.9 and a batch size of 256. LoRA decomposes weight updates into two low-rank matrices A and B with rank r = 4, and the scaling factor α is set to 4. SRT employed a keep ratio τ = 0.2. For AFLA, the balancing hyperparameter λ is set to 0.4.
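
The Dataset Splits row describes a Dirichlet-based non-IID partition with K = 5 clients and β = 0.5. As a rough illustration only, the sketch below shows one common label-wise Dirichlet scheme in plain Python. It is not taken from the authors' repository and may differ from their partitioning code; the function name `dirichlet_partition`, the seed, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients=5, beta=0.5, seed=0):
    """Split sample indices among clients, class by class (hypothetical helper).

    For every class, client shares are drawn from Dirichlet(beta, ..., beta).
    Smaller beta gives more skewed, more heterogeneous client datasets.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)

        # A Dirichlet sample is a vector of Gamma(beta, 1) draws, normalized.
        gammas = [rng.gammavariate(beta, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        shares = [g / total for g in gammas]

        # Turn cumulative shares into cut points over this class's indices.
        start = 0
        cumulative = 0.0
        for k in range(num_clients):
            cumulative += shares[k]
            if k == num_clients - 1:
                end = len(indices)  # last client takes whatever remains
            else:
                end = round(cumulative * len(indices))
            clients[k].extend(indices[start:end])
            start = end
    return clients


# Toy labels standing in for a real dataset (10 classes, 1,000 samples).
toy_labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(toy_labels, num_clients=5, beta=0.5)
print([len(p) for p in parts])
```

Every sample index is assigned to exactly one client, and the client sizes are typically unequal because each class is split with its own random shares.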