A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
Authors: Jiayi Wang, Shiqiang Wang, Rong-Rong Chen, Mingyue Ji
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our findings are validated using experiments. In this section, we present experimental results obtained from various datasets and models to validate our theoretical findings. |
| Researcher Affiliation | Collaboration | (1) Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT, USA; (2) IBM T. J. Watson Research Center, Yorktown Heights, NY, USA. |
| Pseudocode | Yes | Algorithm 1: FedAvg with momentum (Algorithm 1 in Yu et al. (2019a)) ... Algorithm 2: FedAdam in (Reddi et al., 2020). A hedged sketch of FedAvg with momentum is given after this table. |
| Open Source Code | No | The paper does not provide any statements about releasing the source code for the described methodology. |
| Open Datasets | Yes | In this section, we present experimental results obtained from various datasets and models to validate our theoretical findings. In particular, we estimate L, Lh and Lg on MNIST (LeCun et al., 1998) with multilayer perceptron (MLP), CIFAR-10 (Krizhevsky & Hinton, 2009) with CNN and VGG-11, CIFAR-100 with VGG-16. ...Results with synthetic data for quadratic objective functions are also provided... |
| Dataset Splits | No | The paper describes how training data is partitioned among workers and how non-IID data is generated ('X% of the data on one worker is sampled from a single label'), but does not specify explicit train/validation/test splits for the datasets themselves (e.g., '80% training, 10% validation, 10% test'). A sketch of this label-skew partition is given after this table. |
| Hardware Specification | Yes | Environment. All of our experiments are implemented in PyTorch and run on a server with four NVIDIA 2080 Ti GPUs. |
| Software Dependencies | No | All of our experiments are implemented in PyTorch... The paper mentions using PyTorch but does not provide specific version numbers for PyTorch or any other software libraries or dependencies. |
| Experiment Setup | Yes | The mini-batch size of SGD for MNIST and CIFAR-10 is 20. The mini-batch size of SGD for CIFAR-100 is 32. ...For CNN, the learning rates are chosen as η = 2 and γ = 0.05. For MLP, the learning rates are chosen as η = 2 and γ = 0.1. |
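
For context on the pseudocode row, the following is a minimal sketch of FedAvg with local momentum SGD in the spirit of Algorithm 1 of Yu et al. (2019a), written in PyTorch since the paper's experiments are implemented in it. The model, client data loaders, local learning rate `gamma`, server step size `eta`, momentum coefficient, and number of local steps are illustrative placeholders, not the authors' released code.

```python
# Hedged sketch of FedAvg with local momentum SGD (in the spirit of
# Algorithm 1 in Yu et al., 2019a). The model, data loaders, and
# hyperparameters are illustrative assumptions, not the authors' code.
import copy
import itertools
import torch

def local_update(global_model, loader, gamma=0.05, momentum=0.9, local_steps=10):
    """Run `local_steps` of momentum SGD on one client, starting from the global model."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=gamma, momentum=momentum)
    loss_fn = torch.nn.CrossEntropyLoss()
    batches = itertools.cycle(loader)  # cycle in case local_steps > len(loader)
    for _ in range(local_steps):
        x, y = next(batches)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return {name: p.detach() for name, p in model.named_parameters()}

def fedavg_round(global_model, client_loaders, eta=1.0, **local_kwargs):
    """One communication round: average client updates and apply them with server step size eta."""
    client_params = [local_update(global_model, loader, **local_kwargs)
                     for loader in client_loaders]
    with torch.no_grad():
        for name, p in global_model.named_parameters():
            avg_delta = torch.stack([c[name] - p for c in client_params]).mean(dim=0)
            p.add_(eta * avg_delta)
    return global_model
```

With the hyperparameters quoted in the experiment-setup row, one round on the CNN would correspond to something like `fedavg_round(model, loaders, eta=2.0, gamma=0.05)`; the paper's exact local-step count and momentum value are not restated here.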
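
The non-IID generation quoted in the dataset-splits row ('X% of the data on one worker is sampled from a single label') can be approximated with a label-skew partition like the one below. The worker count, skew fraction `p`, random seed, and the assumption that labels are available as an integer array are all illustrative, not the authors' exact procedure.

```python
# Hedged sketch of a label-skewed (non-IID) partition: a fraction p of each
# worker's samples comes from one dominant label, the rest is drawn uniformly
# from the remaining pool. Worker count, p, and the label format are assumptions.
import numpy as np

def label_skew_partition(labels, num_workers=10, p=0.8, seed=0):
    """Return one list of sample indices per worker."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    pool = set(range(len(labels)))
    per_worker = len(labels) // num_workers
    n_dominant = int(p * per_worker)
    parts = []
    for w in range(num_workers):
        dominant = classes[w % len(classes)]
        candidates = [i for i in pool if labels[i] == dominant]
        dom_idx = []
        if candidates:
            dom_idx = rng.choice(candidates, size=min(n_dominant, len(candidates)),
                                 replace=False).tolist()
        pool -= set(dom_idx)
        rest = rng.choice(sorted(pool), size=per_worker - len(dom_idx),
                          replace=False).tolist()
        pool -= set(rest)
        parts.append(dom_idx + rest)
    return parts
```

Each index list can then be wrapped as `torch.utils.data.Subset(train_set, parts[w])` to build one worker's data loader for the FedAvg sketch above.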