Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling

Authors: Hong Huang, Jinhai Yang, Yuan Chen, Jiaxun Ye, Dapeng Wu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that FedRTS achieves state-of-the-art performance in computer vision and natural language processing tasks while reducing communication costs, particularly excelling in scenarios with heterogeneous data distributions and partial client participation.
Researcher Affiliation	Academia	Hong Huang, Jinhai Yang, Yuan Chen, Jiaxun Ye, Dapeng Wu; Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
Pseudocode	Yes	The overflow of TSAdj is shown in Algorithm 1. ... The details of FedRTS are shown in Algorithm 2 and Sec. B.2.
Open Source Code	Yes	Our codes are available at: https://github.com/Little0o0/FedRTS.
Open Datasets	Yes	We conduct experiments on CV tasks using two lightweight models, ResNet18 [13] and ShuffleNetV2 [66], across four well-known image classification datasets: CIFAR-10 [27], CINIC-10 [7], TinyImageNet [8] and SVHN [43]. For NLP tasks, we use the GPT-2-32M model on the TinyStories dataset [10], a language understanding benchmark designed for small language models.
Dataset Splits	Yes	CIFAR-10: It contains 50,000 training images and 10,000 testing images. ... CINIC-10: It contains three equal subsets train, validation, and test each comprising 90,000 images.
Hardware Specification	Yes	The experiments are simulated via multi-process on an Nvidia RTX 5880 GPU, an Intel Core i9 CPU with 48 GB of memory.
Software Dependencies	No	The paper does not explicitly provide specific version numbers for software dependencies.
Experiment Setup	Yes	we conduct Tmax = 500 communication rounds with 5 local training epochs. Batch sizes are set to 64 for CV tasks and 16 for NLP tasks. ... The optimizer used is SGD, with a learning rate of η = 0.01. ... We set the scaling factor λ = 10 ... The trade-off ratio is set as γ = 0.5...
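
For readers attempting a rerun, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The following is only a sketch: the class name `FedRTSSetup` and all field names are hypothetical and are not taken from the authors' repository, and only the values quoted in the row above are filled in. Settings the excerpt elides ("...") are deliberately left out.

```python
from dataclasses import dataclass

# Batch sizes quoted in the Experiment Setup row: 64 for CV, 16 for NLP.
_BATCH_SIZES = {"cv": 64, "nlp": 16}


@dataclass(frozen=True)
class FedRTSSetup:
    """Hypothetical container for the quoted settings (names are illustrative)."""

    task: str                        # "cv" or "nlp"
    communication_rounds: int = 500  # Tmax = 500
    local_epochs: int = 5            # 5 local training epochs
    optimizer: str = "SGD"           # optimizer named in the row
    learning_rate: float = 0.01      # eta = 0.01
    scaling_factor: float = 10.0     # lambda = 10
    trade_off_ratio: float = 0.5     # gamma = 0.5

    def __post_init__(self) -> None:
        # Reject task types for which the row quotes no batch size.
        if self.task not in _BATCH_SIZES:
            raise ValueError(f"unknown task {self.task!r}; expected 'cv' or 'nlp'")

    @property
    def batch_size(self) -> int:
        """Batch size for this task, as quoted in the row."""
        return _BATCH_SIZES[self.task]


cv_setup = FedRTSSetup(task="cv")
nlp_setup = FedRTSSetup(task="nlp")
```

Because the Software Dependencies row reports no version numbers, a configuration like this one pins only the training hyperparameters, not the software environment.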