Personalized Federated Learning with Inferred Collaboration Graphs
Authors: Rui Ye, Zhenyang Ni, Fangzhao Wu, Siheng Chen, Yanfeng Wang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that pFedGraph consistently outperforms the other 14 baseline methods across various heterogeneity levels and multiple cases where malicious clients exist. Code will be available at https://github.com/MediaBrainSJTU/pFedGraph. |
| Researcher Affiliation | Collaboration | 1Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China 2Microsoft Research Asia, Beijing, China 3Shanghai AI Laboratory, Shanghai, China. |
| Pseudocode | Yes | Algorithm 1 pFedGraph |
| Open Source Code | Yes | Code will be available at https://github.com/MediaBrainSJTU/pFedGraph. |
| Open Datasets | Yes | Datasets. Following most personalized FL literature (Li et al., 2021b; Collins et al., 2021; Marfoq et al., 2022), we conduct experiments on Fashion-MNIST (Xiao et al., 2017), CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) for image classification tasks. Beyond this, we also use the Yahoo! Answers text classification dataset (Zhang et al., 2015)... |
| Dataset Splits | Yes | For each client, 20% of the training set is held out for validation. |
| Hardware Specification | No | The paper does not specify any particular CPU or GPU models, or other hardware details used for running the experiments. It mentions network architectures but not the hardware they ran on. |
| Software Dependencies | No | The paper mentions using SGD optimizer but does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | We consider 50 communication rounds in total as personalized FL converges more easily, where each client runs for τ = 200 iterations (Wang et al., 2020b). We use a simple CNN network with 3 convolutional layers and 3 fully-connected layers for image datasets (Li et al., 2021a). For the text dataset, we use a TextCNN (Zhang & Wallace, 2015; Zhu et al., 2020) model with 3 conv layers and a 256-dimension embedding layer. The optimizer is SGD with learning rate 0.01 and batch size 64. For pFedGraph, we set λ = 0.01. For α, since the absolute magnitude is affected by the client number K, α should be set proportional to K; generally, α = 0.08 × K is a pleasant choice. We empirically calculate the similarity based on the current model minus the initial global model (Huang et al., 2021; Sattler et al., 2020). We also empirically set similarity values larger than 0.9 to 1.0. |
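The similarity-and-thresholding step quoted in the setup row can be sketched as follows. This is a minimal NumPy illustration, assuming cosine similarity over flattened model deltas (the excerpt does not spell out the similarity measure); the function names and the `1e-12` stabilizer are hypothetical, not from the paper.

```python
import numpy as np

def model_delta(current, initial):
    """Flatten the difference between a client's current model parameters
    and the initial global model (the basis the paper reports using)."""
    return np.concatenate([(c - i).ravel() for c, i in zip(current, initial)])

def pairwise_similarity(deltas, threshold=0.9):
    """Cosine similarity between client model deltas (an assumption);
    values above the threshold are snapped to 1.0, per the reported setup."""
    K = len(deltas)
    sim = np.eye(K)
    for a in range(K):
        for b in range(a + 1, K):
            denom = np.linalg.norm(deltas[a]) * np.linalg.norm(deltas[b]) + 1e-12
            s = float(deltas[a] @ deltas[b] / denom)
            if s > threshold:
                s = 1.0  # empirical snapping of near-identical clients
            sim[a, b] = sim[b, a] = s
    return sim

# The α heuristic from the setup: scale with client count K.
K = 4
alpha = 0.08 * K
```

Here `pairwise_similarity` returns a symmetric K×K matrix with unit diagonal; the snapping-to-1.0 step mirrors the paper's remark that similarity values above 0.9 are set to 1.0.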