FedFa: A Fully Asynchronous Training Paradigm for Federated Learning

Authors: Haotian Xu, Zhaorui Zhang, Sheng Di, Benben Liu, Khalid Ayed Alharthi, Jiannong Cao

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results indicate our approach effectively improves the training performance of federated learning by up to 6× and 4× speedup compared to the state-of-the-art synchronous and semi-asynchronous strategies while retaining high accuracy in both IID and Non-IID scenarios.
Researcher Affiliation | Academia | 1. The Hong Kong Polytechnic University, Hong Kong; 2. Argonne National Laboratory, USA; 3. The University of Hong Kong, Hong Kong; 4. Department of Computer Science, College of Computing, University of Bisha, Bisha 61922, P.O. Box 551, Saudi Arabia.
Pseudocode | Yes | Algorithm 1: FedFa-Server; Algorithm 2: FedFa-Client. (A generic asynchronous-server sketch appears after this table.)
Open Source Code | No | The paper mentions implementing FedFa on top of Plato and PyTorch but does not provide any statement or link indicating that the source code for their proposed method is open-source or publicly available.
Open Datasets | Yes | For the CV task, we compare our method FedFa with other baselines on the most popular CV dataset, Cifar10 [Krizhevsky and Hinton, 2009]. For the NLP task, we perform full-parameter fine-tuning on the Sent140 [Caldas et al., 2018] task and on the STT2 task using the parameter-efficient fine-tuning method LoRA [Hu et al., 2021].
Dataset Splits | Yes | To simulate the Non-IID environments, we partition the whole training data based on the Dirichlet distribution and use a coefficient α to control the heterogeneity of the data distribution across clients [Hsu et al., 2019], where a small α represents higher data heterogeneity among clients, shown in Fig. 2. We subject the datasets Cifar10 and STT2 to the above division method so that they satisfy the Non-IID setting. Sent140 is a dataset of sentiment categories collected from Twitter, which is naturally divided into federated settings by treating each user as a client, where we choose users with at least 100 samples. We deploy 100 clients and sample 10 clients for each communication round for ResNet18 trained on Cifar10. (A sketch of this Dirichlet partitioning appears after this table.)
Hardware Specification | Yes | Most of our experiments were performed on an NVIDIA GeForce RTX 3090 graphics card. For the task of fine-tuning the large language model, we performed it on two NVIDIA GeForce RTX 4090 graphics cards.
Software Dependencies | No | The paper mentions using Plato and PyTorch for implementation, but it does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | Experiment setup. We deploy 100 clients and sample 10 clients for each communication round for ResNet18 trained on Cifar10. ... Hyperparameters. For ResNet18 on Cifar10, the learning rate ηg is set to 0.01 with local epochs E = 10 and local mini-batch size lb = 32. For the Tiny-Bert experiments on the Sent140 dataset, we set ηg = 0.0004, E = 15, lb = 5, which is inspired by [Cho et al., 2022]. For fine-tuning Bert on the STT2 dataset, we set ηg = 1e-4, E = 1, lb = 32. For the hyperparameters in the LoRA settings, we set r = 1, αLoRA = 1. (These values are collected in the configuration sketch after this table.)
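
The paper's pseudocode (Algorithm 1: FedFa-Server, Algorithm 2: FedFa-Client) is not reproduced here. As a point of reference for what "fully asynchronous" means operationally, below is a minimal, generic sketch of an asynchronous parameter-server loop in Python: the server applies each client update as soon as it arrives instead of waiting for a synchronization barrier. The toy model, the mixing weight MIX, and the simulated clients are illustrative assumptions, not the paper's FedFa-Server/FedFa-Client logic.

```python
import queue
import threading
import numpy as np

# Illustrative sketch only: a generic fully asynchronous FL server loop.
# The toy model size, mixing weight, and simulated clients are assumptions,
# not the paper's Algorithms 1-2.

DIM = 10                 # toy model: a flat parameter vector
NUM_CLIENTS = 4
UPDATES_PER_CLIENT = 5
MIX = 0.5                # server mixing weight for each incoming update (assumed)

update_queue = queue.Queue()   # (client_id, local_params) pairs
global_params = np.zeros(DIM)
lock = threading.Lock()

def client(cid: int) -> None:
    """Simulated client: repeatedly pulls the global model, does fake local
    training, and pushes its new parameters without waiting for other clients."""
    rng = np.random.default_rng(cid)
    for _ in range(UPDATES_PER_CLIENT):
        with lock:
            local = global_params.copy()
        local += rng.normal(scale=0.1, size=DIM)   # stand-in for local SGD epochs
        update_queue.put((cid, local))

def server() -> None:
    """Fully asynchronous server: applies each update as soon as it arrives,
    never collecting a full cohort of clients per round."""
    global global_params
    for _ in range(NUM_CLIENTS * UPDATES_PER_CLIENT):
        cid, local = update_queue.get()
        with lock:
            global_params = (1 - MIX) * global_params + MIX * local
        print(f"applied update from client {cid}")

server_thread = threading.Thread(target=server)
client_threads = [threading.Thread(target=client, args=(c,)) for c in range(NUM_CLIENTS)]
server_thread.start()
for t in client_threads:
    t.start()
for t in client_threads:
    t.join()
server_thread.join()
print("final global params:", global_params.round(3))
```

The key contrast with synchronous or semi-asynchronous baselines is that no client ever blocks on a round barrier; staleness handling and update weighting differ per method and are where FedFa's actual contribution lies.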
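The Dirichlet-based Non-IID split quoted above (Hsu et al., 2019) follows a standard recipe: for each class, draw per-client proportions from Dir(α) and assign that class's samples accordingly, so a smaller α yields more skewed label distributions. A minimal sketch, assuming α = 0.5 and toy labels (the quote does not fix a specific α value):

```python
import numpy as np

def dirichlet_partition(labels: np.ndarray, num_clients: int, alpha: float,
                        seed: int = 0) -> list[list[int]]:
    """Split sample indices across clients with per-class proportions drawn
    from Dir(alpha), as in Hsu et al. (2019). Smaller alpha -> more heterogeneity."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        rng.shuffle(cls_idx)
        # fraction of this class assigned to each client
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(cls_idx)).astype(int)
        for cid, part in enumerate(np.split(cls_idx, cuts)):
            client_indices[cid].extend(part.tolist())
    return client_indices

# Toy usage with 100 clients, matching the quoted client count; alpha is an assumption.
fake_labels = np.random.default_rng(1).integers(0, 10, size=50_000)  # CIFAR-10-sized
splits = dirichlet_partition(fake_labels, num_clients=100, alpha=0.5)
print([len(s) for s in splits[:5]])
```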
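For convenience, the quoted hyperparameters can be collected in a single configuration mapping. The key names and grouping below are assumptions (the authors' code is not public); the numeric values are transcribed from the quote above.

```python
# Per-task hyperparameters transcribed from the quoted experiment setup.
# Key names are illustrative, not taken from the paper's (unavailable) code.
HYPERPARAMS = {
    "resnet18_cifar10": {
        "global_lr": 0.01,       # eta_g
        "local_epochs": 10,      # E
        "local_batch_size": 32,  # lb
        "num_clients": 100,
        "clients_per_round": 10,
    },
    "tinybert_sent140": {
        "global_lr": 4e-4,
        "local_epochs": 15,
        "local_batch_size": 5,
    },
    "bert_sst2_lora": {          # dataset written as "STT2" in the quote
        "global_lr": 1e-4,
        "local_epochs": 1,
        "local_batch_size": 32,
        "lora_r": 1,
        "lora_alpha": 1,
    },
}

if __name__ == "__main__":
    for task, cfg in HYPERPARAMS.items():
        print(task, cfg)
```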