Convergence Analysis of Split Federated Learning on Heterogeneous Data

Authors: Pengchao Han, Chao Huang, Geng Tian, Ming Tang, Xin Liu

NeurIPS 2024

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. LLM response: experiments validate our theoretical results and show that SFL outperforms FL and split learning (SL) when data is highly heterogeneous across a large number of clients.
Researcher Affiliation: Academia. LLM response: Pengchao Han, Guangdong University of Technology, China (hanpengchao@gdut.edu.cn); Chao Huang, Montclair State University, USA (huangch@montclair.edu); Geng Tian, Southern University of Science and Technology, China (12332463@mail.sustech.edu.cn); Ming Tang, Southern University of Science and Technology, China (tangm3@sustech.edu.cn); Xin Liu, University of California, Davis, USA (xinliu@ucdavis.edu).
Pseudocode: Yes. LLM response: Algorithm 1: SFL-V1 under clients' partial participation; Algorithm 2: SFL-V2 under clients' partial participation.
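The two algorithms split one network at a cut layer into a client-side part and a server-side part: clients compute the forward pass up to the cut layer, a main server finishes the forward and backward passes, and a fed server periodically averages the client-side models. The sketch below is a minimal illustration of that flow under partial client participation, closest in spirit to SFL-V2 (one shared server-side model); the tiny MLP, the 50% participation rate, and all variable names are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch of split federated learning (SFL-V2-style) with partial
# client participation. All model sizes and hyperparameters here are assumed.
import copy
import random
import torch
import torch.nn as nn

torch.manual_seed(0)
random.seed(0)

def make_client_model():
    # Client-side portion of the split network (up to the cut layer).
    return nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 64), nn.ReLU())

def make_server_model():
    # Server-side portion of the split network (after the cut layer).
    return nn.Sequential(nn.Linear(64, 10))

def fed_avg(models, weights):
    # Fed-server step: weighted average of client-side model parameters.
    avg = copy.deepcopy(models[0].state_dict())
    for key in avg:
        avg[key] = sum(w * m.state_dict()[key] for m, w in zip(models, weights))
    return avg

N, rounds, local_steps, lr, participation = 10, 3, 5, 0.01, 0.5
# Synthetic per-client batches standing in for a heterogeneous CIFAR partition.
data = [(torch.randn(128, 3, 32, 32), torch.randint(0, 10, (128,))) for _ in range(N)]

client_models = [make_client_model() for _ in range(N)]
server_model = make_server_model()  # SFL-V2 keeps a single shared server-side model
loss_fn = nn.CrossEntropyLoss()

for t in range(rounds):
    sampled = random.sample(range(N), max(1, int(participation * N)))
    for n in sampled:
        x, y = data[n]
        opt_c = torch.optim.SGD(client_models[n].parameters(), lr=lr)
        opt_s = torch.optim.SGD(server_model.parameters(), lr=lr)
        for _ in range(local_steps):
            opt_c.zero_grad(); opt_s.zero_grad()
            smashed = client_models[n](x)              # client-side forward (sent to main server)
            loss = loss_fn(server_model(smashed), y)   # server-side forward and loss
            loss.backward()                            # gradients flow back through the cut layer
            opt_s.step(); opt_c.step()
    # Fed server aggregates the client-side models of the participating clients.
    weights = [1.0 / len(sampled)] * len(sampled)
    new_state = fed_avg([client_models[n] for n in sampled], weights)
    for m in client_models:
        m.load_state_dict(new_state)
    print(f"round {t}: last training loss {loss.item():.3f}")
```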
Open Source Code: Yes. LLM response: Our codes are provided in https://github.com/TIANGeng708/Convergence-Analysis-of-Split-Federated-Learning-on-Heterogeneous-Data.
Open Datasets: Yes. LLM response: We conduct experiments on CIFAR-10 and CIFAR-100 [13]. More experiments on FEMNIST are given in Appendix I.5.
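For reference, both CIFAR benchmarks can be pulled from standard loaders; the torchvision calls below are an assumption about tooling (the paper does not name its framework, see the Software Dependencies entry below), not a description of the released code.

```python
# Hypothetical loaders for the datasets named above; torchvision is assumed,
# since the paper does not state which framework it uses.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),  # converts PIL images to [0, 1] float tensors
])

cifar10_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
cifar10_test = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
cifar100_train = datasets.CIFAR100(root="./data", train=True, download=True, transform=transform)
cifar100_test = datasets.CIFAR100(root="./data", train=False, download=True, transform=transform)
```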
Dataset Splits: No. LLM response: The paper mentions training parameters and local epochs but does not specify validation dataset splits (e.g., percentages or counts) or reference standard validation splits.
Hardware Specification: Yes. LLM response: The experiments are run on a CPU (Intel(R) Xeon(R) Gold 5320 at 2.20GHz) and a GPU (A100-PCIE-80GB).
Software Dependencies: No. LLM response: The paper mentions the use of ResNet-18 as a model structure, learning rates, and batch sizes, but it does not specify any software libraries or frameworks with their version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x).
Experiment Setup: Yes. LLM response: The learning rates for SFL-V1, SFL-V2, FL, and SL are set as 0.01. The batch size bs is 128, and we run experiments for T = 200 rounds. Unless stated otherwise, we use N = 10, β = 0.1, E = 5, where E is the number of local epochs between client-side model aggregations (i.e., after every E epochs of training over each client's dataset, the client-side models are aggregated at the fed server), and hence τ = (D_n / bs) · E. The same τ is used for vanilla FL to ensure a fair comparison.
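To make the quoted setup concrete, the sketch below partitions class labels across N = 10 clients and derives the per-client step count τ = (D_n / bs) · E for bs = 128 and E = 5. Reading β = 0.1 as a Dirichlet concentration parameter that controls label heterogeneity is an assumption here (a common convention for non-IID federated splits); the synthetic labels stand in for CIFAR-10, and smaller β produces more skewed per-client label distributions.

```python
# Hedged sketch: Dirichlet(beta) label partition across clients and the
# resulting local-step count tau = E * D_n / bs. Values mirror the quoted setup.
import numpy as np

rng = np.random.default_rng(0)
N, beta, E, bs, num_classes = 10, 0.1, 5, 128, 10
labels = rng.integers(0, num_classes, size=50_000)  # stand-in for CIFAR-10 train labels

# For each class, split its sample indices across clients with Dirichlet(beta) proportions.
client_indices = [[] for _ in range(N)]
for c in range(num_classes):
    idx = np.where(labels == c)[0]
    rng.shuffle(idx)
    proportions = rng.dirichlet([beta] * N)
    cuts = (np.cumsum(proportions) * len(idx)).astype(int)[:-1]
    for n, part in enumerate(np.split(idx, cuts)):
        client_indices[n].extend(part.tolist())

for n, idx in enumerate(client_indices):
    D_n = len(idx)                # client n's local dataset size
    tau_n = E * D_n // bs         # local steps per round: tau = (D_n / bs) * E
    print(f"client {n}: D_n = {D_n}, tau_n = {tau_n}")
```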