DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

Authors: Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, Dacheng Tao

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate that DisPFL significantly saves the communication bottleneck for the busiest node among all clients and, at the same time, achieves higher model accuracy with less computation cost and communication rounds. Furthermore, we demonstrate that our method can easily adapt to heterogeneous local clients with varying computation complexities and achieves better personalized performances.
Researcher Affiliation | Collaboration | (1) University of Science and Technology of China, Hefei, China; (2) JD Explore Academy, Beijing, China; (3) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China.
Pseudocode | Yes | Algorithm 1 DisPFL... Algorithm 2 Local mask searching (a hedged sketch of a mask-update step of this kind appears after the table).
Open Source Code | Yes | Code is available at https://github.com/rong-dai/DisPFL.
Open Datasets | Yes | We evaluate the performance of the proposed algorithm on three image classification datasets: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and Tiny-ImageNet.
Dataset Splits | No | The paper details training and testing data partitions ('We partition the training data according to a Dirichlet distribution Dir(α) for each client and generate the corresponding test data for each client following the same distribution.'), but it does not specify a separate validation split with explicit percentages or counts. A partitioning sketch appears after the table.
Hardware Specification | No | The paper mentions 'edge devices' and 'heterogeneous clients' with varying computation capabilities, but it does not specify the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper states 'We follow PyTorch's implementation of ResNet18... and VGG11...' and 'We use SGD optimizer...', but it does not provide specific version numbers for PyTorch or any other software libraries or dependencies. A setup sketch appears after the table.
Experiment Setup | Yes | The total client number is set to 100, and we restrict the busiest node to communicate with at most 10 neighbors. The sparsity of the local model is set to 0.5 for all clients in the main experiments... We use the SGD optimizer for all methods with a weight decay of 0.0005. For all methods except Ditto, local epochs are fixed to 5. ...The learning rate is initialized at 0.1 and decayed by a factor of 0.998 after each communication round. The batch size is fixed to 128 for all experiments. We run 500 global communication rounds for CIFAR-10 and CIFAR-100, and 300 for Tiny-ImageNet. A configuration sketch covering these hyperparameters appears after the table.
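
To make the quoted data setup in the Open Datasets and Dataset Splits rows concrete, here is a minimal sketch of Dirichlet label-skew partitioning for CIFAR-10. The helper name `dirichlet_partition`, the α value, and the seed are illustrative assumptions; only the 100-client count comes from the table, and the authors' actual partitioning code lives in the linked repository.

```python
# Hypothetical sketch: Dir(alpha) label-skew partitioning of CIFAR-10 across
# clients, mirroring the setup quoted above. Not the authors' exact code.
import numpy as np
from torchvision import datasets, transforms

NUM_CLIENTS = 100   # "The total client number is set to 100"
ALPHA = 0.3         # Dirichlet concentration; illustrative value, not quoted in the table

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Return one index array per client, with Dir(alpha) label skew."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx_c = rng.permutation(np.where(labels == c)[0])
        # Sample each client's share of class c from Dir(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions) * len(idx_c)).astype(int)[:-1]
        for client_id, shard in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return [np.array(ids) for ids in client_indices]

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
client_idx = dirichlet_partition(train_set.targets, NUM_CLIENTS, ALPHA)
```

Smaller α concentrates each client on fewer classes, which is how the paper's "Dir(α)" setup produces heterogeneous local data.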
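
The Software Dependencies and Experiment Setup rows quote the model family, optimizer, and schedule but no library versions. Below is a hedged sketch of how that configuration could be reproduced with torchvision's ResNet-18 and SGD. Only the hyperparameters quoted in the table (learning rate 0.1, weight decay 0.0005, batch size 128, per-round decay 0.998, 5 local epochs, 500/300 rounds) come from the paper; the CIFAR stem change, the momentum value, and the function names are assumptions.

```python
# Hypothetical reproduction sketch of the quoted training configuration.
# Hyperparameters come from the Experiment Setup row; everything else is assumed.
import torch
import torch.nn as nn
from torchvision.models import resnet18

def build_client_model(num_classes=10):
    # "We follow PyTorch's implementation of ResNet18"; adapting the stem for
    # 32x32 CIFAR inputs is a common convention, not something the table quotes.
    model = resnet18(num_classes=num_classes)
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    return model

model = build_client_model()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9,        # assumed; not quoted in the table
                            weight_decay=5e-4)   # "weight decay of 0.0005"

LOCAL_EPOCHS = 5      # "local epochs are fixed to 5" (except Ditto)
BATCH_SIZE = 128      # "The batch size is fixed to 128"
ROUNDS = 500          # 500 rounds for CIFAR-10/100, 300 for Tiny-ImageNet

def lr_at_round(t, base_lr=0.1, decay=0.998):
    # "decayed by a factor of 0.998 after each communication round"
    return base_lr * (decay ** t)
```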
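
The Pseudocode row names Algorithm 2 (local mask searching) but the quoted text does not spell it out. The sketch below shows a generic prune-and-regrow mask update of the kind used in dynamic sparse training (magnitude-based pruning plus gradient-based regrowth at fixed sparsity); the specific criteria, the `update_mask` name, and the regrowth rule are assumptions for illustration, not the paper's exact Algorithm 2, which should be taken from the authors' repository.

```python
# Hypothetical sketch of a prune-and-regrow mask update in the spirit of
# dynamic sparse training. The criteria below are assumptions; consult the
# linked repository for the actual Algorithm 2.
import torch

def update_mask(weight, grad, mask, prune_frac=0.1):
    """Drop the smallest-magnitude active weights and regrow the same number
    of currently masked positions with the largest gradient magnitude,
    keeping the overall sparsity fixed. `mask` is a float 0/1 tensor."""
    n_active = int(mask.sum().item())
    k = max(1, int(prune_frac * n_active))

    # Prune: among active weights, deactivate the k with the smallest |w|.
    active_mag = torch.where(mask.bool(), weight.abs(),
                             torch.full_like(weight, float("inf")))
    drop_idx = torch.topk(active_mag.flatten(), k, largest=False).indices
    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0

    # Regrow: among inactive weights, activate the k with the largest |grad|.
    inactive_grad = torch.where(new_mask.bool(),
                                torch.full_like(grad.flatten(), float("-inf")),
                                grad.abs().flatten())
    grow_idx = torch.topk(inactive_grad, k, largest=True).indices
    new_mask[grow_idx] = 1.0
    return new_mask.view_as(mask)
```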