Personalized Federated Learning with Feature Alignment and Classifier Collaboration

Authors: Jian Xu, Xinyi Tong, Shao-Lun Huang

ICLR 2023

Reproducibility variables, results, and LLM responses:

Research Type: Experimental
LLM Response: "Finally, extensive evaluation results on benchmark datasets with various heterogeneous data scenarios demonstrate the effectiveness of our proposed method." and Section 6, "Experiments".

Researcher Affiliation: Academia
LLM Response: "Jian Xu, Xinyi Tong, Shao-Lun Huang, Tsinghua Shenzhen International Graduate School, Tsinghua University"

Pseudocode: Yes
LLM Response: "Algorithm 1: FedPAC"

Open Source Code: Yes
LLM Response: "Codes for the results in this paper are provided in the supplementary material."

Open Datasets: Yes
LLM Response: "We consider image classification tasks and evaluate our method on four popular datasets: EMNIST with 62 categories of handwritten characters, Fashion-MNIST with 10 categories of clothes, CIFAR-10 and CINIC-10 with 10 categories of color images. We construct two different CNN models for EMNIST/Fashion-MNIST and CIFAR-10/CINIC-10, respectively. Details of datasets and model architectures are provided in Appendix B."

Dataset Splits: Yes
LLM Response: "Similar to (Karimireddy et al., 2020b; Zhang et al., 2021b; Huang et al., 2021), we make all clients have the same data size, in which s% of data (20% by default) are uniformly sampled from all classes, and the remaining (100 - s)% from a set of dominant classes for each client. We construct two experimental settings, where the number of global models is set as 3 and 5, respectively." (A minimal sketch of this partitioning scheme is given after the checklist.)

Hardware Specification: Yes
LLM Response: "All experiments are implemented in PyTorch and simulated in NVIDIA GeForce RTX 3090 GPUs."

Software Dependencies: No
LLM Response: The paper states "All experiments are implemented in PyTorch" but does not specify the PyTorch version or any other software dependencies, such as Python or CUDA versions.

Experiment Setup: Yes
LLM Response: "The step size η of local training is set to 0.01 for EMNIST/Fashion-MNIST, and 0.02 for CIFAR-10/CINIC-10. Notice that our method alternately optimizes the feature extractor and the classifier. To reduce the local computational overhead, we only train the classifier for one epoch with a larger step size η_g = 0.1 for all experiments, and train the feature extractor for multiple epochs with the same step size η_f = η as other baselines. The weight decay is set to 5e-4 and the momentum is set to 0.5. The batch size is fixed to B = 50 for all datasets except EMNIST, where we set B = 100. The number of local training epochs is set to E = 5 for all federated learning approaches unless explicitly specified." (A sketch of how these hyperparameters fit into one local training round also follows the checklist.)
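
For the Dataset Splits entry, the following is a minimal sketch of the kind of partitioning the quoted description implies: equal client sizes, an s fraction drawn uniformly from all classes, and the rest from per-client dominant classes. This is not the authors' released code; the function name, the round-robin assignment of dominant classes to clients, and the handling of exhausted class pools are illustrative assumptions.

    import numpy as np

    def partition_non_iid(labels, num_clients, num_dominant, s=0.2, seed=0):
        # Split sample indices across clients with equal target sizes: a fraction
        # s of each client's data is drawn (roughly) uniformly from all classes,
        # and the remaining (1 - s) from that client's dominant classes.
        rng = np.random.default_rng(seed)
        labels = np.asarray(labels)
        num_classes = int(labels.max()) + 1
        # Shuffled, disjoint index pool for every class.
        pools = {c: list(rng.permutation(np.where(labels == c)[0]))
                 for c in range(num_classes)}

        per_client = len(labels) // num_clients
        n_iid = int(s * per_client)      # uniform portion (20% by default)
        n_dom = per_client - n_iid       # dominant-class portion

        client_indices = []
        for i in range(num_clients):
            take = []
            # Uniform portion: one randomly chosen class per sample.
            for _ in range(n_iid):
                c = int(rng.integers(num_classes))
                if pools[c]:
                    take.append(pools[c].pop())
            # Dominant portion: round-robin over this client's dominant classes.
            dominant = [(i * num_dominant + k) % num_classes
                        for k in range(num_dominant)]
            for j in range(n_dom):
                c = dominant[j % num_dominant]
                if pools[c]:
                    take.append(pools[c].pop())
            client_indices.append(take)
        return client_indices

For example, partitioning the 50,000 CIFAR-10 training images across, say, 100 clients with s = 0.2 gives each client 500 indices, 100 drawn uniformly and 400 from its dominant classes; the client count here is only an illustrative choice, not a value taken from the paper.
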
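For the Experiment Setup entry, this is a compact sketch of how the reported hyperparameters could fit into one client's local round: the classifier trained for a single epoch at the larger step size, then the feature extractor for E epochs at the baseline step size. It illustrates only the optimization schedule and hyperparameters; it omits FedPAC's feature-alignment and classifier-collaboration terms, and the model.features / model.classifier submodule names, the ordering of the two phases, and the overall structure are assumptions rather than the authors' code.

    import torch
    from torch import nn, optim

    # Hyperparameters reported above (CIFAR-10/CINIC-10 values shown).
    ETA_F = 0.02         # feature-extractor step size (0.01 for EMNIST/Fashion-MNIST)
    ETA_G = 0.1          # classifier step size
    WEIGHT_DECAY = 5e-4
    MOMENTUM = 0.5
    LOCAL_EPOCHS = 5     # E

    def local_update(model, loader, device="cpu"):
        # One local round: classifier for one epoch, then feature extractor
        # for E epochs, mirroring the alternating scheme described above.
        criterion = nn.CrossEntropyLoss()
        model.to(device).train()

        # Phase 1: classifier only, one epoch, larger step size.
        opt_g = optim.SGD(model.classifier.parameters(), lr=ETA_G,
                          momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt_g.zero_grad()
            criterion(model(x), y).backward()
            opt_g.step()

        # Phase 2: feature extractor only, E epochs, baseline step size.
        opt_f = optim.SGD(model.features.parameters(), lr=ETA_F,
                          momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)
        for _ in range(LOCAL_EPOCHS):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                opt_f.zero_grad()
                criterion(model(x), y).backward()
                opt_f.step()
        return model

The loader would be built with the reported batch sizes (B = 50, or B = 100 for EMNIST); updating only one parameter group per phase is one straightforward way to realize the alternating optimization the quote describes.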