Improving the Model Consistency of Decentralized Federated Learning
Authors: Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, Dacheng Tao
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our methods can achieve competitive performance compared with CFL methods and outperform existing DFL methods. ... Empirically, we conduct extensive experiments on CIFAR-10 and CIFAR-100 datasets in both identical data distribution (IID) and non-IID settings. |
| Researcher Affiliation | Collaboration | 1Tsinghua University, Shenzhen, China 2JD Explore Academy, Beijing, China 3Hong Kong Polytechnic University, Hong Kong, China 4The University of Sydney, Australia. |
| Pseudocode | Yes | Algorithm 1: DFedSAM and DFedSAM-MGS (a hedged sketch of the SAM local step appears after the table). |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a direct link to a code repository for the described methodology. |
| Open Datasets | Yes | The efficacy of the proposed DFedSAM and DFedSAM-MGS is evaluated on CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009) in both IID and non-IID settings. |
| Dataset Splits | Yes | Specifically, Dirichlet Partition (Hsu et al., 2019) and Pathological Partition are used to simulate non-IID data across federated clients. The former partitions each client's local data by splitting the total dataset according to label ratios sampled from the Dirichlet distribution Dir(α) with α = 0.3 and α = 0.6 (see the partition sketch after the table). The Pathological Partition is detailed in Appendix C due to limited space: the sorted data is divided into 200 partitions over 100 clients, and each client is randomly assigned 2 partitions from 2 classes. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments (e.g., GPU/CPU models, specific server configurations). |
| Software Dependencies | No | The paper mentions models like VGG-11 and ResNet-18, and the SGD optimizer, but does not specify versions for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | The total number of clients is set to 100, among which 10% of clients participate in communication. ... We initialize the local learning rate to 0.1 with a decay rate of 0.998 per communication round for all experiments. ... The batch size is fixed to 128 for all experiments. We run 1000 global communication rounds for CIFAR-10 and CIFAR-100. ... Other optimizer hyper-parameters: ρ = 0.01 for our algorithms (DFedSAM and DFedSAM-MGS with Q = 1)... For local iterations K, the number of training epochs in D-PSGD is set to 1, while that for all other methods is set to 5. (These hyper-parameters are also collected in the configuration sketch after the table.) |
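
For the Dirichlet split quoted in the Dataset Splits row, the following is a minimal NumPy sketch, assuming integer label arrays such as CIFAR-10/100 targets; the function name and defaults are illustrative and not taken from the paper or any released code.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=100, alpha=0.3, seed=0):
    """Assign sample indices to clients by drawing per-class client
    proportions from Dir(alpha), following Hsu et al. (2019)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx_c = np.flatnonzero(labels == c)
        rng.shuffle(idx_c)
        # One Dirichlet draw per class decides how its samples spread over clients.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cut_points = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, shard in enumerate(np.split(idx_c, cut_points)):
            client_indices[client_id].extend(shard.tolist())
    return [np.asarray(ix) for ix in client_indices]

# Example: split CIFAR-10 training labels across 100 clients with alpha = 0.3.
# client_idx = dirichlet_partition(train_labels, num_clients=100, alpha=0.3)
```

Smaller α values concentrate each client's data on fewer labels, so α = 0.3 is a more heterogeneous split than α = 0.6.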
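
The pseudocode row refers to Algorithm 1 (DFedSAM and DFedSAM-MGS), whose local update is a SAM-style two-pass step with perturbation radius ρ. Below is a hedged PyTorch sketch of one such step under the quoted ρ = 0.01; it shows generic SAM mechanics only and is not the authors' implementation.

```python
import torch

def sam_local_step(model, loss_fn, x, y, base_opt, rho=0.01):
    # Illustrative SAM-style step (rho = 0.01 as quoted above); not the authors' code.
    # 1) Gradient at the current weights.
    base_opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2) + 1e-12

    # 2) Perturb weights by eps = rho * g / ||g|| (the sharpness-aware ascent step).
    eps = []
    with torch.no_grad():
        for p, g in zip(params, grads):
            e = g * (rho / grad_norm)
            p.add_(e)
            eps.append(e)

    # 3) Gradient at the perturbed weights drives the actual SGD update.
    base_opt.zero_grad()
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)  # restore the original weights before stepping
    base_opt.step()
    return loss.item()
```

In the decentralized setting described in the paper, each client would run several such local steps per round and then mix the resulting model with its neighbours over the communication topology (with Q gossip steps in DFedSAM-MGS); that outer loop is omitted here.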
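
Finally, the hyper-parameters quoted in the Experiment Setup row can be collected into a single configuration object for reference; the key names below are hypothetical and chosen only for readability.

```python
# Hypothetical config collecting the hyper-parameters quoted in the table;
# key names are illustrative and not taken from any released repository.
EXPERIMENT_CONFIG = {
    "num_clients": 100,
    "participation_ratio": 0.1,     # 10% of clients communicate per round
    "local_lr": 0.1,
    "lr_decay_per_round": 0.998,
    "batch_size": 128,
    "global_rounds": 1000,          # CIFAR-10 and CIFAR-100
    "sam_rho": 0.01,                # perturbation radius for DFedSAM / DFedSAM-MGS (Q = 1)
    "local_epochs": 5,              # 1 for D-PSGD, 5 for all other methods
    "datasets": ["CIFAR-10", "CIFAR-100"],
    "dirichlet_alpha": [0.3, 0.6],
}
```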