Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation

Authors: Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh, Nir Weinberger, Deniz Gündüz

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through extensive simulations, we show that our method achieves 5-32× reduction in total communication cost while preserving accuracy. We demonstrate substantial communication savings, reducing total cost by factors of 5-32 while maintaining competitive accuracy across standard benchmarks. Our ablation studies analyze the effects of shared randomness and the choice of side information.
Researcher Affiliation	Academia	Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh (Technical University of Munich, Munich, Germany); Nir Weinberger (Israel Institute of Technology, Haifa, Israel); Deniz Gündüz (Imperial College London, London, United Kingdom)
Pseudocode	Yes	Algorithm 1: BICOMPFL-GR with Global Randomness; Algorithm 2: BICOMPFL-PR with Private Randomness; Algorithm 3: BICOMPFL-GR-CFL with stochastic quantization Q_s(·) and EF21 from Richtárik et al. [2021]
Open Source Code	Yes	The code to reproduce our experiments is included in the supplementary material.
Open Datasets	Yes	We study n = 10 clients ... collaboratively training a convolutional neural network (CNN)-based classifier for the datasets MNIST, Fashion-MNIST and CIFAR-10 under the orchestration of a federator.
Dataset Splits	No	We study n = 10 clients ... collaboratively training a convolutional neural network (CNN)-based classifier for the datasets MNIST, Fashion-MNIST and CIFAR-10... We evaluate the schemes in two settings: with a uniform data allocation (i.i.d.), to model homogeneous systems, and with a non-i.i.d. allocation, to model heterogeneous systems, where data allocation for each client is drawn from a Dirichlet distribution with parameter α = 0.1.
Hardware Specification	Yes	Table 1: System specifications of our simulation cluster. Columns: CPU(s) | RAM | GPU(s) | VRAM. Rows: 2x Intel Xeon Platinum 8176 (56 cores) | 256 GB | 2x NVIDIA GeForce GTX 1080 Ti | 11 GB; 2x AMD EPYC 7282 (32 cores) | 512 GB | NVIDIA GeForce RTX 4090 | 24 GB; 2x AMD EPYC 7282 (32 cores) | 640 GB | NVIDIA GeForce RTX 4090 | 24 GB; 2x AMD EPYC 7282 (32 cores) | 448 GB | NVIDIA GeForce RTX 4080 | 16 GB; 2x AMD EPYC 7282 (32 cores) | 256 GB | NVIDIA GeForce RTX 4080 | 16 GB; HGX-A100 (96 cores) | 1 TB | 4x NVIDIA A100 | 80 GB; DGX-A100 (252 cores) | 2 TB | 8x NVIDIA Tesla A100 | 80 GB; DGX-1-V100 (76 cores) | 512 GB | 8x NVIDIA Tesla V100 | 16 GB; DGX-1-P100 (76 cores) | 512 GB | 8x NVIDIA Tesla P100 | 16 GB; HPE-P100 (28 cores) | 256 GB | 4x NVIDIA Tesla P100 | 16 GB
Software Dependencies	No	We use Adam [Kingma and Ba, 2015] as an optimizer with learning rate η = 0.0003 for all non-stochastic methods, and η = 0.1 for probabilistic mask training. For non-stochastic FL, we use a federator (server) learning rate of 0.1, i.e., the clients' gradients are averaged, and the federator updates the global model with learning rate 0.1, and with a learning rate of 0.005 for BICOMPFL-GR with SignSGD. For M3, we use a federator learning rate of 0.02 to obtain reliable results. For LIEC and CSER, we use an average period of 50 global iterations (cf. [Cheng et al., 2024, Xie et al., 2020]). For M3, we use TopK with K = d/n. The paper also states in the NeurIPS checklist, "We also provide the Python environment used to generate the results," but this is not specific enough to meet the criteria for versioned software dependencies within the paper's content.
Experiment Setup	Yes	We use the cross-entropy loss and a batch size of 128 in all our experiments. We use Adam [Kingma and Ba, 2015] as an optimizer with learning rate η = 0.0003 for all non-stochastic methods, and η = 0.1 for probabilistic mask training. For non-stochastic FL, we use a federator (server) learning rate of 0.1, i.e., the clients' gradients are averaged, and the federator updates the global model with learning rate 0.1, and with a learning rate of 0.005 for BICOMPFL-GR with SignSGD. For M3, we use a federator learning rate of 0.02 to obtain reliable results. For LIEC and CSER, we use an average period of 50 global iterations (cf. [Cheng et al., 2024, Xie et al., 2020]). For M3, we use TopK with K = d/n. We train MNIST and Fashion-MNIST for 200 global iterations and CIFAR-10 for 400 global iterations. Through all experiments and datasets, we carry L = 3 local iterations per client.
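
The non-i.i.d. setting quoted under Dataset Splits draws each client's data allocation from a Dirichlet distribution with α = 0.1 across n = 10 clients, but the quoted text does not give the exact procedure. The following is a minimal sketch of one common label-wise scheme and may differ from what the authors did. The function names `dirichlet_sample` and `dirichlet_partition`, the seed, and the toy labels are assumptions made for illustration; only the standard library is used.

```python
import random
from collections import defaultdict


def dirichlet_sample(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k outcomes."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against underflow for very small alpha
        return [1.0 / k] * k
    return [d / total for d in draws]


def dirichlet_partition(labels, n_clients=10, alpha=0.1, seed=0):
    """Assign sample indices to clients class by class (assumed scheme)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(n_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        shares = dirichlet_sample(alpha, n_clients, rng)
        # Turn shares into cut points; the last client takes the remainder.
        start = 0
        cumulative = 0.0
        for client in range(n_clients):
            cumulative += shares[client]
            if client == n_clients - 1:
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            end = max(start, min(end, len(indices)))
            clients[client].extend(indices[start:end])
            start = end
    return clients


# Toy labels standing in for a 10-class dataset such as MNIST.
toy_labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(toy_labels)
```

With a small α such as 0.1, most of each class tends to land on one or two clients, which is what makes the split heterogeneous; larger values of α approach the uniform (i.i.d.) allocation.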