Exploiting Label Skews in Federated Learning with Model Concatenation
Authors: Yiqun Diao, Qinbin Li, Bingsheng He
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that FedConcat achieves significantly higher accuracy than previous state-of-the-art FL methods in various heterogeneous label skew distribution settings and meanwhile has lower communication costs. Our code is publicly available at https://github.com/sjtudyq/FedConcat. |
| Researcher Affiliation | Academia | National University of Singapore; University of California, Berkeley |
| Pseudocode | Yes | Algorithm 1: FedConcat and FedConcat-ID |
| Open Source Code | Yes | Our code is publicly available at https://github.com/sjtudyq/FedConcat. |
| Open Datasets | Yes | Our experiments engage CIFAR-10 (Krizhevsky, Hinton et al. 2009), FMNIST (Xiao, Rasul, and Vollgraf 2017), SVHN (Netzer et al. 2011), CIFAR-100 (Krizhevsky, Hinton et al. 2009), and Tiny-ImageNet (Wu, Zhang, and Xu 2017) datasets to evaluate our algorithm. |
| Dataset Splits | No | The paper mentions partitioning the dataset into clients and using a specific strategy for non-IID settings, but it does not explicitly state the train/validation/test dataset splits (e.g., percentages or counts) or reference a standard split that is widely known without requiring external lookup. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using SGD optimizer but does not provide specific software dependencies with version numbers (e.g., Python version, deep learning framework version like PyTorch or TensorFlow, or specific library versions). |
| Experiment Setup | Yes | The baseline settings replicate those from Li et al. (2021), running 50 rounds with each client training 10 local epochs per round, batch size 64, and learning rate 0.01 using the SGD optimizer with weight decay 10⁻⁵. By default, our configuration divides the 40 clients into K = 5 clusters and allocates 200 rounds for training the classifier. |
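
For readers reproducing the setup, the hyperparameters quoted in the Experiment Setup row translate into roughly the following local-training loop. This is a minimal PyTorch sketch under our own assumptions: the model, loss choice, and data loader are placeholders, not the authors' released code.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader

# Values taken from the Experiment Setup row; names below are illustrative.
ROUNDS = 50          # communication rounds
LOCAL_EPOCHS = 10    # local epochs per client per round
BATCH_SIZE = 64
LR = 0.01
WEIGHT_DECAY = 1e-5

def local_update(model: nn.Module, loader: DataLoader) -> nn.Module:
    """One client's local training for a single communication round."""
    criterion = nn.CrossEntropyLoss()  # assumed loss; not stated in the excerpt
    optimizer = optim.SGD(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
    model.train()
    for _ in range(LOCAL_EPOCHS):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```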
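The concatenation step itself is only named in the title and the setup row (40 clients grouped into K = 5 clusters, 200 rounds for training the classifier), so the sketch below is an assumed reading of that structure rather than the paper's implementation: features from the K cluster encoders are concatenated and a single classifier is trained on top. The feature dimension, class count, and the freezing of the encoders are illustrative assumptions.

```python
import torch
from torch import nn

K = 5            # default number of clusters from the setup row
FEAT_DIM = 128   # per-encoder feature size -- an assumption, not from the paper
NUM_CLASSES = 10 # e.g., CIFAR-10 / FMNIST / SVHN

class ConcatModel(nn.Module):
    """Concatenate features from K cluster encoders and classify on top.

    Freezing the encoders is an assumption; the excerpt only states that
    200 rounds are allocated to training the classifier.
    """
    def __init__(self, encoders: list[nn.Module]):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)
        for p in self.encoders.parameters():
            p.requires_grad = False
        self.classifier = nn.Linear(K * FEAT_DIM, NUM_CLASSES)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [enc(x) for enc in self.encoders]        # K feature vectors per sample
        return self.classifier(torch.cat(feats, dim=1))  # concatenate, then classify
```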