Personalized Federated Learning with Inferred Collaboration Graphs

Authors: Rui Ye, Zhenyang Ni, Fangzhao Wu, Siheng Chen, Yanfeng Wang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that pFedGraph consistently outperforms the other 14 baseline methods across various heterogeneity levels and multiple cases where malicious clients exist. Code will be available at https://github.com/MediaBrain-SJTU/pFedGraph."
Researcher Affiliation | Collaboration | "1 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China; 2 Microsoft Research Asia, Beijing, China; 3 Shanghai AI Laboratory, Shanghai, China."
Pseudocode | Yes | "Algorithm 1: pFedGraph"
Open Source Code | Yes | "Code will be available at https://github.com/MediaBrain-SJTU/pFedGraph."
Open Datasets | Yes | "Datasets. Following most personalized FL literature (Li et al., 2021b; Collins et al., 2021; Marfoq et al., 2022), we conduct experiments on Fashion-MNIST (Xiao et al., 2017), CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) for image classification tasks. Beyond this, we also use the Yahoo! Answers text classification dataset (Zhang et al., 2015)..."
Dataset Splits | Yes | "For each client, 20% of the training set is held out for validation."
Hardware Specification | No | The paper does not specify the CPU or GPU models or any other hardware used to run the experiments; it describes the network architectures but not the hardware they ran on.
Software Dependencies | No | The paper mentions using the SGD optimizer but gives no version numbers for any software libraries, frameworks, or programming languages.
Experiment Setup | Yes | "We consider 50 communication rounds in total as personalized FL is easier to converge, where each client runs for τ = 200 iterations (Wang et al., 2020b). We use a simple CNN network with 3 convolutional layers and 3 fully-connected layers for image datasets (Li et al., 2021a). For the text dataset, we use a TextCNN (Zhang & Wallace, 2015; Zhu et al., 2020) model with 3 conv layers and a 256-dimensional embedding layer. The optimizer used is SGD with learning rate 0.01 and batch size 64. For pFedGraph, we set λ = 0.01. For α, since the absolute magnitude will be affected by the client number K, α should be set proportional to K; generally, α = 0.08K is a pleasant choice. We empirically calculate the similarity based on the current model minus the initial global model (Huang et al., 2021; Sattler et al., 2020). We also empirically set similarity values larger than 0.9 to 1.0."
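
To make the quoted setup concrete, below is a minimal sketch in PyTorch of the pieces the table pins down: the per-client 80/20 train/validation split, the reported hyperparameters, and the update-based similarity with the 0.9 → 1.0 clamp. The helper names (make_client_loaders, flatten_delta, pairwise_similarity) and the constants are our own labels for the reported values, not code from the authors' repository; we also assume cosine similarity over flattened parameter updates, which the quoted text does not pin down, and the collaboration-graph optimization that consumes the similarity matrix in pFedGraph is not shown here.

    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, random_split

    # Reported hyperparameters (values taken from the quoted setup text).
    ROUNDS = 50          # communication rounds
    LOCAL_ITERS = 200    # tau: local iterations per client per round
    LR = 0.01            # SGD learning rate
    BATCH_SIZE = 64
    LAMBDA = 0.01        # regularization weight lambda in pFedGraph
    # alpha is set proportional to the client number K, e.g. alpha = 0.08 * K.

    def make_client_loaders(dataset, batch_size=BATCH_SIZE,
                            val_fraction=0.2, seed=0):
        """Hold out 20% of a client's training set for validation."""
        n_val = int(len(dataset) * val_fraction)
        train_set, val_set = random_split(
            dataset, [len(dataset) - n_val, n_val],
            generator=torch.Generator().manual_seed(seed))
        return (DataLoader(train_set, batch_size=batch_size, shuffle=True),
                DataLoader(val_set, batch_size=batch_size))

    def flatten_delta(model, init_global_model):
        """Flatten (current model - initial global model) into one vector."""
        return torch.cat([
            (p.detach() - q.detach()).reshape(-1)
            for p, q in zip(model.parameters(),
                            init_global_model.parameters())])

    def pairwise_similarity(client_models, init_global_model, clamp_at=0.9):
        """Client-pair similarity of model updates (cosine assumed);
        values above 0.9 are set to 1.0, as the quoted text reports."""
        deltas = [flatten_delta(m, init_global_model)
                  for m in client_models]
        K = len(deltas)
        sim = torch.empty(K, K)
        for i in range(K):
            for j in range(K):
                sim[i, j] = F.cosine_similarity(deltas[i], deltas[j], dim=0)
        sim[sim > clamp_at] = 1.0
        return sim

    # Local training uses plain SGD per the quoted text, e.g.:
    # optimizer = torch.optim.SGD(model.parameters(), lr=LR)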