Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning
Authors: Haibo Yang, Minghong Fang, Jia Liu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on MNIST and CIFAR-10 to verify our theoretical results. |
| Researcher Affiliation | Academia | Haibo Yang, Minghong Fang, and Jia Liu Department of Electrical and Computer Engineering The Ohio State University Columbus, OH 43210 USA {yang.5952, fang.841, liu.1736}@osu.edu |
| Pseudocode | Yes | Algorithm 1: A Generalized FedAvg Algorithm with Two-Sided Learning Rates. (A sketch of this update rule appears below the table.) |
| Open Source Code | No | The paper does not provide an explicit link to open-source code for the described methodology. |
| Open Datasets | Yes | We conduct extensive experiments on MNIST and CIFAR-10 to verify our theoretical results. We use three models: logistic regression (LR), a fully-connected neural network with two hidden layers (2NN), and a convolutional neural network (CNN) with the non-i.i.d. version of MNIST (LeCun et al., 1998), and one ResNet model with CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper discusses training and testing samples but does not specify details for a validation split (e.g., percentages or counts for a validation set). |
| Hardware Specification | Yes | We run the experiments using the same GPU (NVIDIA V100) to ensure the same conditions. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies. |
| Experiment Setup | Yes | In this section, we elaborate on the results under non-i.i.d. MNIST datasets for the 2NN. We distribute the MNIST dataset among m = 100 workers randomly and evenly in a digit-based manner such that the local dataset for each worker contains only a certain class of digits. The number of digits in each worker's dataset represents the non-i.i.d. degree. For digits_10, each worker has training/testing samples with ten digits from 0 to 9, which is essentially an i.i.d. case. For digits_1, each worker has samples only associated with one digit, which leads to highly non-i.i.d. datasets among workers. For partial worker participation, we set the number of workers n = 10 in each communication round. (A partitioning sketch appears below the table.) |
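
The pseudocode row quotes Algorithm 1, a generalized FedAvg with two-sided learning rates: each round samples n workers, each runs K local SGD steps with a local rate, and the server applies the averaged model delta with a separate global rate. Below is a minimal PyTorch sketch of that round structure under stated assumptions; the function name `generalized_fedavg` and parameters `eta` (server rate) and `eta_l` (local rate) are illustrative labels, not the authors' code, and each worker is assumed to be represented by a data loader yielding at least `local_steps` batches.

```python
import copy
import random
import torch
import torch.nn.functional as F

def generalized_fedavg(global_model, workers, rounds, n, local_steps,
                       eta=1.0, eta_l=0.1):
    """Sketch of generalized FedAvg with two-sided learning rates.

    Assumptions (not from the paper's code, which is not released):
    - `workers` is a list of per-worker DataLoaders over local data;
    - each loader yields at least `local_steps` (x, y) batches per round.
    """
    for t in range(rounds):
        # Partial participation: sample n of the m workers uniformly.
        participants = random.sample(workers, n)
        global_params = [p.detach().clone() for p in global_model.parameters()]
        deltas = [torch.zeros_like(p) for p in global_params]

        for loader in participants:
            # Each participant starts from the current global model x_t.
            local_model = copy.deepcopy(global_model)
            opt = torch.optim.SGD(local_model.parameters(), lr=eta_l)
            batches = iter(loader)
            for _ in range(local_steps):
                x, y = next(batches)
                opt.zero_grad()
                loss = F.cross_entropy(local_model(x), y)
                loss.backward()
                opt.step()
            # Accumulate the averaged local delta: (x_{t,K}^i - x_t) / n.
            for d, p_local, p_global in zip(deltas,
                                            local_model.parameters(),
                                            global_params):
                d += (p_local.detach() - p_global) / n

        # Server update with the global learning rate eta.
        with torch.no_grad():
            for p, d in zip(global_model.parameters(), deltas):
                p += eta * d
    return global_model
```

The point of the two rates is that `eta_l` controls local drift during the K local steps while `eta` scales the server step; plain FedAvg is recovered with `eta = 1`.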
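The experiment-setup row describes a digit-based non-i.i.d. split: each of m = 100 workers receives samples from only a fixed number of digit classes (digits_1 is highly non-i.i.d., digits_10 is effectively i.i.d.). A minimal NumPy sketch consistent with that description follows; `digit_based_partition` and its arguments are hypothetical names, and it assumes m is a multiple of 10 and that m * digits_per_worker is divisible by 10 so shards split evenly.

```python
import numpy as np
from torchvision import datasets

def digit_based_partition(labels, m=100, digits_per_worker=1, seed=0):
    """Split sample indices so worker i sees only `digits_per_worker`
    digit classes (a sketch of the paper's digit-based non-i.i.d. split).
    Assumes m % 10 == 0 and (m * digits_per_worker) % 10 == 0."""
    rng = np.random.default_rng(seed)
    # Shuffle the indices of each digit class, then pre-split each class
    # into exactly the number of shards it must supply across workers.
    shards_per_digit = m * digits_per_worker // 10
    shard_pool = {
        d: list(np.array_split(rng.permutation(np.where(labels == d)[0]),
                               shards_per_digit))
        for d in range(10)
    }
    parts = []
    for i in range(m):
        # Worker i draws one shard from each of its assigned digit classes;
        # the (i + j) % 10 rule keeps the classes per worker distinct.
        digits = [(i + j) % 10 for j in range(digits_per_worker)]
        parts.append(np.concatenate([shard_pool[d].pop() for d in digits]))
    return parts

# Usage: partition MNIST training labels for the digits_1 setting.
train = datasets.MNIST("data", train=True, download=True)
parts = digit_based_partition(train.targets.numpy(), m=100,
                              digits_per_worker=1)
```

With `digits_per_worker=1` each worker holds samples of a single digit (the paper's digits_1 case); raising it toward 10 smoothly interpolates to the i.i.d. digits_10 case.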