DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning

Authors: Kangyang Luo, Shuai Wang, Yexuan Fu, Xiang Li, Yunshi Lan, Ming Gao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on various image classification tasks illustrate that DFRD achieves significant performance gains compared to SOTA baselines. Our code is here: https://anonymous.4open.science/r/DFRD-0C83/.
Researcher Affiliation | Academia | Kangyang Luo¹, Shuai Wang¹, Yexuan Fu¹, Xiang Li¹, Yunshi Lan¹, Ming Gao¹,²; ¹School of Data Science & Engineering, ²KLATASDS-MOE in School of Statistics, East China Normal University, Shanghai, China
Pseudocode | Yes | Moreover, we present pseudocode for DFRD in Appendix C.
Open Source Code | Yes | Our code is here: https://anonymous.4open.science/r/DFRD-0C83/.
Open Datasets | Yes | In this paper, we evaluate different methods with six real-world image classification task-related datasets, namely Fashion-MNIST [69] (FMNIST in short), SVHN [70], CIFAR-10, CIFAR-100 [71], Tiny-ImageNet (http://cs231n.stanford.edu/tiny-imagenet-200.zip) and Food101 [73]. We detail the six datasets in Appendix B.
Dataset Splits | No | The paper mentions partitioning the training set for clients and using a test set, but does not provide specific details on train/validation splits or their percentages/counts.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup | Yes | Unless otherwise specified, all experiments are performed on a centralized network with N = 10 active clients. We set ω ∈ {0.01, 0.1, 1.0} to mimic different data heterogeneity scenarios... We fix σ = 4 and consider ρ ∈ {5, 10, 40}... Unless otherwise specified, we set β_tran and β_div both to 1 in training the generator, while in robust model distillation, we set λ = 0.5 and α = 0.5.
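The quoted setup does not spell out how ω induces data heterogeneity; in federated-learning benchmarks ω of this kind is typically the concentration parameter of a Dirichlet label-skew partition, with smaller ω producing more skewed client datasets. The sketch below illustrates that interpretation only; the function name `partition_by_dirichlet` and its defaults are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

def partition_by_dirichlet(labels, num_clients=10, omega=0.1, seed=0):
    """Split sample indices across clients with Dirichlet(omega) label skew.

    Smaller omega -> more heterogeneous (each client sees few classes);
    larger omega -> closer to an IID split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Proportion of class-c samples assigned to each client.
        proportions = rng.dirichlet(np.full(num_clients, omega))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices

if __name__ == "__main__":
    # Toy labels standing in for one of the six datasets (e.g. 10 classes).
    toy_labels = np.random.randint(0, 10, size=5000)
    for omega in (0.01, 0.1, 1.0):  # heterogeneity levels quoted in the setup
        shards = partition_by_dirichlet(toy_labels, num_clients=10, omega=omega)
        print(f"omega={omega}: client sizes {[len(s) for s in shards]}")
```

Running the toy example shows the effect of the three quoted heterogeneity levels: at ω = 0.01 most clients receive samples from only one or two classes, while at ω = 1.0 the per-client class mixtures are much closer to uniform.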