Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization
Authors: Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Results. We evaluate all algorithms on two setups with non-IID data partitioning: (1) Logistic Regression on a Synthetic Federated Dataset: The dataset Synthetic(1, 1) is originally constructed in [38]. (2) DNN trained on a Non-IID partitioned CIFAR-10 dataset: We train a VGG-11 [52] network on the CIFAR-10 dataset [53], which is partitioned across 16 clients using a Dirichlet distribution Dir16(0.1), as done in [54]. The original CIFAR-10 test set (without partitioning) is used to evaluate the generalization performance of the trained global model. (A hedged sketch of this kind of Dirichlet partitioning appears after the table.) |
| Researcher Affiliation | Academia | Jianyu Wang, Carnegie Mellon University, Pittsburgh, PA 15213, jianyuw1@andrew.cmu.edu; Qinghua Liu, Princeton University, Princeton, NJ 08544, qinghual@princeton.edu; Hao Liang, Carnegie Mellon University, Pittsburgh, PA 15213, hliang2@andrew.cmu.edu; Gauri Joshi, Carnegie Mellon University, Pittsburgh, PA 15213, gaurij@andrew.cmu.edu; H. Vincent Poor, Princeton University, Princeton, NJ 08544, poor@princeton.edu |
| Pseudocode | Yes | We provide its pseudo-code in the Appendix. (A hedged sketch of the normalized-averaging server update appears after the table.) |
| Open Source Code | Yes | Our code is available at: https://github.com/JYWa/FedNova. |
| Open Datasets | Yes | The dataset Synthetic(1, 1) is originally constructed in [38]. ... We train a VGG-11 [52] network on the CIFAR10 dataset [53], which is partitioned across 16 clients using a Dirichlet distribution Dir16(0.1), as done in [54]. |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and evaluating on the CIFAR-10 test set, but does not explicitly state or describe a validation set or how data was split for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | The local learning rate η is decayed by a constant factor after finishing 50% and 75% of the communication rounds. The initial value of η is tuned separately for FedAvg with different local solvers. On CIFAR-10, we run each experiment with 3 random seeds and report the average and standard deviation. In FedProx, we set µ = 1, the best value reported in [38]. Clients perform GD with η = 0.05, which is decayed by a factor of 5 at rounds 600 and 900. Middle: Only C = 0.3 fraction of clients are randomly selected per round to perform Ei = 5 local epochs. Right: Only C = 0.3 fraction of clients are randomly selected per round to perform random and time-varying local epochs Ei(t) ∼ U(1, 5). (A hedged sketch of this round schedule appears after the table.) |
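
The CIFAR-10 split quoted in the Research Type and Open Datasets rows (16 clients, Dir16(0.1)) is a per-class Dirichlet partition. The sketch below illustrates that style of partitioner; it is not the authors' released code, and the function name `dirichlet_partition`, the NumPy seeding, and the rounding of split points are our own assumptions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=16, alpha=0.1, seed=0):
    """Partition sample indices across clients with a per-class Dirichlet prior.

    For each class, a Dirichlet(alpha) vector decides what fraction of that
    class's samples each client receives; small alpha gives highly non-IID splits.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Class proportions for the clients, turned into integer split points.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        split_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, split_points)):
            client_indices[client_id].extend(shard.tolist())
    return [np.array(ci) for ci in client_indices]

# Toy usage: synthetic labels standing in for the CIFAR-10 training labels.
fake_labels = np.random.randint(0, 10, size=50_000)
parts = dirichlet_partition(fake_labels, num_clients=16, alpha=0.1)
print([len(p) for p in parts])
```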
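The FedNova pseudo-code itself is in the paper's Appendix and the released repository; the sketch below only illustrates the normalized-averaging server update for the special case of plain local SGD, where client i's cumulative update is divided by its step count τ_i before weighting. The function name `fednova_aggregate`, the flat-vector model representation, and the choice τ_eff = Σ_i p_i τ_i are assumptions for this sketch.

```python
import numpy as np

def fednova_aggregate(global_weights, client_weights, tau, data_sizes):
    """One FedNova-style server update with plain local SGD (a_i = tau_i).

    Each client's cumulative update is normalized by its number of local
    steps tau_i before weighting, so clients that ran more steps do not
    dominate the round; tau_eff rescales the averaged direction.
    """
    p = np.asarray(data_sizes, dtype=float)
    p /= p.sum()                               # relative dataset sizes p_i
    tau = np.asarray(tau, dtype=float)
    tau_eff = float(np.sum(p * tau))           # effective number of local steps

    # Normalized per-client updates d_i = (x - x_i) / tau_i.
    normalized = [(global_weights - w) / t for w, t in zip(client_weights, tau)]
    avg_direction = sum(pi * d for pi, d in zip(p, normalized))
    return global_weights - tau_eff * avg_direction

# Toy usage: 3 clients, a 5-dimensional model, heterogeneous local step counts.
x = np.zeros(5)
clients = [x - 0.1 * np.ones(5) * t for t in (2, 5, 8)]   # fake local models
print(fednova_aggregate(x, clients, tau=(2, 5, 8), data_sizes=(100, 200, 300)))
```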
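The round schedule in the Experiment Setup row (step decay of η after 50% and 75% of the communication rounds, C = 0.3 client participation, local epochs drawn from U(1, 5)) can be sketched as follows. The helper names, the default decay factor of 5, and the rounding of the selected-client count are assumptions, not values the paper confirms for every setting.

```python
import random

def step_decay_lr(initial_lr, round_idx, total_rounds, factor=5.0):
    """Decay the local learning rate by a constant factor after 50% and 75% of rounds."""
    lr = initial_lr
    if round_idx >= 0.5 * total_rounds:
        lr /= factor
    if round_idx >= 0.75 * total_rounds:
        lr /= factor
    return lr

def sample_round(num_clients=16, participation=0.3, min_epochs=1, max_epochs=5, seed=None):
    """Pick a C-fraction of clients and draw each one's local-epoch count from U(1, 5)."""
    rng = random.Random(seed)
    num_selected = max(1, round(participation * num_clients))
    selected = rng.sample(range(num_clients), num_selected)
    return {client: rng.randint(min_epochs, max_epochs) for client in selected}

# Toy usage: learning rate mid-schedule and one round's client/epoch assignment.
print(step_decay_lr(0.05, round_idx=700, total_rounds=1000))
print(sample_round(seed=0))
```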