FedFed: Feature Distillation against Data Heterogeneity in Federated Learning
Authors: Zhiqin Yang, Yonggang Zhang, Yu Zheng, Xinmei Tian, Hao Peng, Tongliang Liu, Bo Han
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate the efficacy of FedFed in promoting model performance. The code is publicly available at: https://github.com/tmlr-group/FedFed. We deploy FedFed on four popular FL algorithms, including FedAvg [4], FedProx [6], SCAFFOLD [7], and FedNova [22]. Atop them, we conduct comprehensive experiments on various scenarios regarding different amounts of clients, varying degrees of heterogeneity, and four datasets. Extensive results show that the FedFed achieves considerable performance gains in all settings. Our contributions are summarized as follows: 3. We conduct comprehensive experiments to show that FedFed consistently and significantly enhances the convergence rate and generalization performance of FL models across different scenarios under various datasets (Sec 4.2). |
| Researcher Affiliation | Academia | Zhiqin Yang (1,2), Yonggang Zhang (2), Yu Zheng (3), Xinmei Tian (5), Hao Peng (1,6), Tongliang Liu (4), Bo Han (2); 1 Beihang University, 2 Hong Kong Baptist University, 3 Chinese University of Hong Kong, 4 Sydney AI Centre, The University of Sydney, 5 University of Science and Technology of China, 6 Kunming University of Science and Technology |
| Pseudocode | Yes | Algorithm 1 summarizes the procedure of feature distillation. Pseudo-code of how to apply FedFed are listed in Appendix B. Algorithm 2 FedAvg/FedProx with FedFed. Algorithm 3 SCAFFOLD with FedFed. Algorithm 4 FedNova with FedFed. |
| Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/FedFed |
| Open Datasets | Yes | Following previous works [10, 29], we conduct experiments over CIFAR-10, CIFAR-100 [30], Fashion-MNIST (FMNIST) [31], and SVHN [32]. Following [5], we employ latent Dirichlet sampling (LDA) [33] to simulate Non-IID distribution. |
| Dataset Splits | No | The paper mentions setting up Non-IID distributions with specific alpha values for datasets and notes batch sizes and local epochs. However, it does not provide explicit training/validation/test dataset splits (e.g., percentages or counts) or refer to standard predefined splits for reproducibility beyond the dataset names themselves. |
| Hardware Specification | Yes | Besides, all experiments are performed on Python 3.8, 36 core 3.00GHz Intel Core i9 CPU, and NVIDIA RTX A6000 GPUs. |
| Software Dependencies | No | The paper mentions "Python 3.8" but does not list specific versions for other key software components, libraries, or frameworks (e.g., PyTorch, TensorFlow) that would be essential for reproducibility. |
| Experiment Setup | Yes | We use ResNet-18 [35] both in the feature distillation and classifier in FL. Table 7: The values of all parameters in this paper. Federated Learning Relevant: α = 0.1/0.05 (heterogeneity degree); T_d = 15 (communication rounds of feature distillation); T_r = 1,000 (communication rounds of classifier training); E_d = 1 (local epochs of feature distillation); E/E_r = 1/5 (local epochs of classifier training); σ_s² = 0.15 (DP noise level, added to x_s); \|C_t\|/\|C_r\| = 5/10 (#selected clients every communication round); K = 10/100 (#clients of federated system). Training Process Relevant: η_k = 0.01/0.001/0.0001 (learning rate); B = 32/64 (batch size); M = 0.9 (momentum); wd = 0.0001 (weight decay for regularization). |
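
The Open Datasets row cites latent Dirichlet sampling (LDA) with concentration α = 0.1/0.05 to simulate Non-IID client splits. The snippet below is a minimal sketch of that standard per-class Dirichlet partitioning scheme, not the authors' code; the function name `dirichlet_partition` and the NumPy-based implementation are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.1, seed=0):
    """Split sample indices across clients with a per-class Dirichlet prior.

    labels: 1-D integer array of class labels (e.g., CIFAR-10 targets).
    alpha:  concentration parameter; smaller values give more skewed clients.
    Returns one array of sample indices per client.
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # Fraction of class c that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cut_points = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, shard in enumerate(np.split(idx_c, cut_points)):
            client_indices[client_id].extend(shard.tolist())
    return [np.array(idx) for idx in client_indices]
```

For instance, `dirichlet_partition(np.array(train_set.targets), num_clients=10, alpha=0.1)` on a torchvision CIFAR-10 `train_set` would mirror the K = 10, α = 0.1 setting quoted above.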
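The Experiment Setup row reports a DP noise level of σ_s² = 0.15 added to the shared performance-sensitive features x_s. The sketch below only illustrates that protection step under stated assumptions (PyTorch tensors, Gaussian noise, 0.15 read as a variance); it is not the authors' implementation.

```python
import torch

def protect_shared_features(x_s: torch.Tensor, noise_level: float = 0.15) -> torch.Tensor:
    """Add zero-mean Gaussian noise to the performance-sensitive features x_s
    before sharing them globally. Treating the quoted 0.15 as the noise
    variance is an assumption; reading it as the standard deviation is also
    plausible.
    """
    return x_s + torch.randn_like(x_s) * noise_level ** 0.5
```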
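For convenience, the Table 7 values quoted in the Experiment Setup row can be collected into a single configuration. The dictionary below is only a restatement of those numbers; the key names are assumptions, not identifiers from the released code.

```python
# Restatement of the Table 7 values quoted above; key names are assumptions.
FEDFED_CONFIG = {
    "alpha": (0.1, 0.05),                     # LDA heterogeneity degree
    "rounds_feature_distillation": 15,        # T_d
    "rounds_classifier": 1000,                # T_r
    "local_epochs_distillation": 1,           # E_d
    "local_epochs_classifier": (1, 5),        # E / E_r
    "dp_noise_level": 0.15,                   # sigma_s^2, added to x_s
    "clients_per_round": (5, 10),             # |C_t| / |C_r|
    "num_clients": (10, 100),                 # K
    "learning_rate": (0.01, 0.001, 0.0001),   # eta_k
    "batch_size": (32, 64),                   # B
    "momentum": 0.9,                          # M
    "weight_decay": 1e-4,                     # wd
}
```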