A Unified Analysis of Federated Learning with Arbitrary Client Participation

Authors: Shiqiang Wang, Mingyue Ji

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also discuss various insights, recommendations, and experimental results. [...] We ran experiments of training convolutional neural networks (CNNs) with Fashion MNIST [34] and CIFAR-10 [19] datasets, each of which has images in 10 different classes.
Researcher Affiliation | Collaboration | Shiqiang Wang, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, wangshiq@us.ibm.com; Mingyue Ji, Department of ECE, University of Utah, Salt Lake City, UT 84112, mingyue.ji@utah.edu
Pseudocode | Yes | Algorithm 1: Generalized FedAvg with amplified updates and arbitrary participation (a schematic sketch of this algorithm follows the table)
Open Source Code | Yes | Please visit https://shiqiang.wang/code/fl-arbitrary-participation
Open Datasets | Yes | We ran experiments of training convolutional neural networks (CNNs) with Fashion MNIST [34] and CIFAR-10 [19] datasets
Dataset Splits | No | The paper notes that training details, including data splits, are given in the appendix, but the main text does not provide explicit percentages or counts for training, validation, or test splits. It refers to 'initial training' rounds but not to explicit validation sets or their sizes.
Hardware Specification | No | The main paper states that [...]
Software Dependencies | No | The paper does not provide version numbers for the software dependencies used in the experiments. The use of Python and standard machine-learning libraries is implied by context, but no explicit versions are stated.
Experiment Setup | Yes | The initial rates are γ = 0.1 and γ = 0.05 without amplification (i.e., η = 1) for Fashion MNIST and CIFAR-10, respectively, which were obtained using grid search in a separate scenario of always participation. After an initial training of 2,000 rounds for Fashion MNIST and 4,000 rounds for CIFAR-10, we study the performance of different approaches with their own learning rates. [...] When using amplification, we set η = 10 and P = 500. [...] The best learning rate γ of each approach was separately found on a grid that is {1, 0.1, 0.01, 0.001, 0.0001} times the initial learning rate. (A short snippet spelling out this grid follows the table.)
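
The Pseudocode row names Algorithm 1, a generalized FedAvg with amplified updates under arbitrary participation. Below is a minimal, hypothetical Python sketch of that idea as the row describes it: clients that happen to participate in a round run a few local SGD steps, the server averages their updates, and every P rounds the accumulated global change is amplified by a factor eta. The least-squares objective, the Bernoulli participation model, and the names local_sgd and generalized_fedavg are illustrative assumptions, not the authors' exact Algorithm 1.

import numpy as np

def local_sgd(x, data, gamma, num_local_steps):
    # A few gradient steps on a client's local least-squares objective
    # (stand-in for the local training a real client would perform).
    A, b = data
    for _ in range(num_local_steps):
        grad = A.T @ (A @ x - b) / len(b)
        x = x - gamma * grad
    return x

def generalized_fedavg(client_data, rounds=200, gamma=0.05, eta=2.0, P=50,
                       num_local_steps=5, participation_prob=0.3, seed=0):
    rng = np.random.default_rng(seed)
    dim = client_data[0][0].shape[1]
    x = np.zeros(dim)        # global model parameters
    x_anchor = x.copy()      # snapshot taken at the last amplification point
    for t in range(1, rounds + 1):
        # Arbitrary participation modeled here as independent Bernoulli draws;
        # the paper's analysis covers far more general participation patterns.
        participating = [i for i in range(len(client_data))
                         if rng.random() < participation_prob]
        if participating:
            # Each participating client runs local SGD and reports its update.
            updates = [local_sgd(x.copy(), client_data[i], gamma, num_local_steps) - x
                       for i in participating]
            x = x + np.mean(updates, axis=0)
        # Every P rounds, amplify the accumulated global change by eta.
        if t % P == 0:
            x = x_anchor + eta * (x - x_anchor)
            x_anchor = x.copy()
    return x

# Toy usage: 20 synthetic least-squares clients sharing one true model.
rng = np.random.default_rng(1)
w_true = rng.normal(size=5)
clients = []
for _ in range(20):
    A = rng.normal(size=(30, 5))
    b = A @ w_true + 0.01 * rng.normal(size=30)
    clients.append((A, b))
w_hat = generalized_fedavg(clients)
print("estimation error:", np.linalg.norm(w_hat - w_true))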
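
The Experiment Setup row quotes the concrete hyperparameters used in the paper. The snippet below only spells out the quoted learning-rate grid; the dictionary keys and variable names are my own labels, not from the paper.

initial_gamma = {"fashion_mnist": 0.1, "cifar10": 0.05}   # quoted initial rates
multipliers = [1, 0.1, 0.01, 0.001, 0.0001]               # quoted grid factors
lr_grid = {ds: [g0 * m for m in multipliers] for ds, g0 in initial_gamma.items()}
print(lr_grid)
# Amplified runs in the quoted setup use eta = 10 and P = 500.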