Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Maximizing Global Model Appeal in Federated Learning
Authors: Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri Joshi
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we evaluate MaxFL on a number of different datasets while comparing with a wide range of baselines to show that maximizing GM-Appeal, i.e., training a global model that appeals to a larger number of clients, provides many benefits for FL, including: i) the server gaining more participating clients to select from for training a better global model for the seen clients, ii) the global model having a higher chance of good performance on unseen clients, and iii) clients gaining better performance with the global model when they combine MaxFL with local fine-tuning. |
| Researcher Affiliation | Academia | Yae Jee Cho (Carnegie Mellon University); Divyansh Jhunjhunwala (Carnegie Mellon University); Tian Li (Carnegie Mellon University, University of Chicago); Virginia Smith (Carnegie Mellon University); Gauri Joshi (Carnegie Mellon University) |
| Pseudocode | Yes | Algorithm 1: Our Proposed MaxFL Solver |
| Open Source Code | Yes | The code used for all experiments is included in the supplementary material. |
| Open Datasets | Yes | We evaluate MaxFL in three different settings: image classification for non-iid partitioned (i) FMNIST (Xiao et al., 2017) and (ii) EMNIST with 62 labels (Cohen et al., 2017), and (iii) sentiment analysis for Sent140 (Go et al., 2009) with an MLP. |
| Dataset Splits | Yes | The data of each client is partitioned into a 60% : 40% training/test split unless mentioned otherwise. |
| Hardware Specification | Yes | All experiments are conducted on clusters equipped with one NVIDIA Titan X GPU. |
| Software Dependencies | Yes | The algorithms are implemented in PyTorch 1.11.0. |
| Experiment Setup | Yes | Specifically, we do a grid search over the learning rates ηl, ηg ∈ {0.1, 0.05, 0.01, 0.005, 0.001}, batch size b ∈ {32, 64, 128}, and local iterations τ ∈ {10, 30, 50} to find the hyper-parameters with the highest test accuracy for each benchmark. |
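The grid search quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: `train_and_eval` is a hypothetical placeholder standing in for one full federated training run that returns test accuracy.

```python
import itertools

# Hyper-parameter grid described in the Experiment Setup row.
LEARNING_RATES = [0.1, 0.05, 0.01, 0.005, 0.001]  # candidate eta_l * eta_g values
BATCH_SIZES = [32, 64, 128]
LOCAL_ITERATIONS = [10, 30, 50]

def grid_search(train_and_eval):
    """Try every (lr, b, tau) combination and return the one with the
    highest test accuracy, along with that accuracy."""
    best_acc, best_cfg = float("-inf"), None
    for lr, b, tau in itertools.product(LEARNING_RATES, BATCH_SIZES, LOCAL_ITERATIONS):
        acc = train_and_eval(lr=lr, batch_size=b, local_iters=tau)
        if acc > best_acc:
            best_acc, best_cfg = acc, (lr, b, tau)
    return best_cfg, best_acc
```

The grid has 5 × 3 × 3 = 45 configurations per benchmark, each requiring a full training run, so in practice the search dominates the experimental cost.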