Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

An Efficient and Accurate Dynamic Sparse Training Framework Based on Parameter-Freezing

Authors: Lei Li, Haochen Yang, Jiacheng Guo, Hongkai Yu, Minghai Qin, Tianyun Zhang

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that the model accuracy has significantly improved when combining our proposed methods. For example, compared with the previous state-of-the-art methods with the same total amount of communication cost and computation FLOPs, the accuracy increases on average by 4% and 6% in our methods for CIFAR-10 and CIFAR-100 datasets on ResNet-18, respectively. On the other hand, when targeting the same accuracy, the proposed method can reduce the communication cost by 4-8 times for different datasets with different sparsity levels.
Researcher Affiliation	Collaboration	Lei Li (1), Haochen Yang (1), Jiacheng Guo (1), Hongkai Yu (1), Minghai Qin (1, 2), Tianyun Zhang (1). (1) Cleveland State University, Cleveland, USA; (2) Western Digital Research, Milpitas, USA
Pseudocode	Yes	Algorithm 1: Mask readjustment on the server with differential sparsity; Algorithm 2: Parameter-freezing-based dynamic sparse training
Open Source Code	Yes	Code and Appendix: https://github.com/Dawns14/pffdst.git
Open Datasets	Yes	This paper evaluates the performance of a proposed framework, PFFDST, against established FL techniques on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and CIFAR-100 (Krizhevsky, Hinton et al. 2009) datasets using LeNet (LeCun et al. 1998) and ResNet-18 (He et al. 2016) models.
Dataset Splits	Yes	The datasets are partitioned across multiple clients using a Dirichlet distribution (α = 0.1) to simulate non-IID data settings. LeNet experiments are evaluated based on the highest accuracy, aligning with established FedDST (Bibikar et al. 2022) benchmarks. In the experiments using ResNet-18, we report the average accuracy with error bounds to provide a comprehensive performance assessment. These error bounds, derived from multiple experimental runs, offer a robust measure of the variability in performance and enhance the statistical significance of the reported average accuracy.
Hardware Specification	Yes	The implementation leverages PyTorch (Paszke et al. 2019) on a server equipped with 8 A6000 GPUs, with detailed hyperparameter settings outlined in Appendix A.
Software Dependencies	No	The implementation leverages PyTorch (Paszke et al. 2019) on a server equipped with 8 A6000 GPUs, with detailed hyperparameter settings outlined in Appendix A. Although PyTorch is mentioned, a specific version number is not provided in the main text.
Experiment Setup	Yes	The number of training epochs is set to 3 for FedDST and 2 for PFFDST to maintain comparable FLOPs. Additional parameters include R = 10, R_end = total rounds/8, 400 clients for LeNet, 200 clients for ResNet-18, and 20 randomly selected clients per communication round. The implementation leverages PyTorch (Paszke et al. 2019) on a server equipped with 8 A6000 GPUs, with detailed hyperparameter settings outlined in Appendix A. PFFDST configurations utilize two sparsity levels: s1 = (s + 1)/2 and s2 = s, with differential sparsities f1 = f2 = (1 - s)/4 to control communication overhead.
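
For readers attempting to reproduce the setup quoted in the table, the following is a minimal sketch of two of its ingredients: a label-wise Dirichlet partition (α = 0.1) across clients, and the arithmetic for the two sparsity levels. It is not the authors' implementation. The function names `dirichlet_partition` and `pffdst_sparsity_levels`, the seed handling, and the per-class splitting strategy are assumptions made for illustration, and the differential sparsity is read here as (1 - s)/4, which is itself an assumption about the quoted formula.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha=0.1, seed=0):
    """Split sample indices across clients, class by class.

    For each class, client shares are drawn from Dirichlet(alpha, ..., alpha).
    A small alpha (such as 0.1) concentrates each class on a few clients,
    which is a common way to simulate non-IID data. Hypothetical helper,
    not taken from the paper's code.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn cumulative shares into cut points inside this class's indices.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices


def pffdst_sparsity_levels(s):
    """Return (s1, s2, f1, f2) for a target sparsity s.

    Uses s1 = (s + 1) / 2 and s2 = s as quoted in the table; f1 = f2 is
    assumed to be (1 - s) / 4.
    """
    s1 = (s + 1) / 2
    s2 = s
    f = (1 - s) / 4
    return s1, s2, f, f


if __name__ == "__main__":
    # Toy labels standing in for a 10-class dataset such as CIFAR-10.
    toy_labels = np.repeat(np.arange(10), 100)
    parts = dirichlet_partition(toy_labels, num_clients=20, alpha=0.1)
    print([len(p) for p in parts])
    print(pffdst_sparsity_levels(0.8))
```

Every sample index is assigned to exactly one client, but some clients may receive very few or no samples when α is small; whether the original experiments enforce a minimum client size is not stated in the quoted text.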