Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning

Authors: Wenke Huang, Zekun Shi, Mang Ye, He Li, Bo Du

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate the effectiveness. We conduct experiments on various heterogeneous federated scenarios (Krizhevsky & Hinton, 2009; LeCun et al., 1998; Xiao et al., 2017), under different data-based and parameter-based attacks (Fang & Ye, 2022; Shi et al., 2022). Experimental results reveal that ours consistently achieves stronger robustness than others. Fig. 4 plots the accuracy of popular Byzantine-robust aggregation methods and shows that ours performs significantly better than counterparts. We compare with several related aggregation solutions, divided into three types. We conduct ablative studies to investigate the efficacy of essential components in Self-Driven Entropy Aggregation (SDEA).
Researcher Affiliation | Academia | Wenke Huang*¹, Zekun Shi*¹, Mang Ye¹˒², He Li¹, Bo Du¹ (*equal contribution). ¹National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China. ²Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, China. Correspondence to: Mang Ye <yemang@whu.edu.cn>.
Pseudocode | Yes | Algorithm 1: Self-Driven Entropy Aggregation
Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Datasets. Following (Xie et al., 2022; Li et al., 2021), we evaluate efficacy and robustness on three scenarios. Cifar-10 (Krizhevsky & Hinton, 2009) contains 50k training and 10k testing 32×32 images for 10 classes. MNIST (LeCun et al., 1998) covers 10 classes with 70,000 images. Fashion-MNIST (Xiao et al., 2017) includes 60k training examples and 10k testing examples from 10 categories. Proxy/Public Data. ...USPS (Hull, 1994), SVHN (Netzer et al., 2011) and SYN (Roy et al., 2018) with the same label space but different domain skew. Besides, our Self-Driven Entropy Aggregation supports random public data usage, and we further utilize the Tiny-ImageNet (Russakovsky et al., 2015) and Market-1501 (Zheng et al., 2015) datasets.
Dataset Splits | Yes | Cifar-10 (Krizhevsky & Hinton, 2009) contains 50k training and 10k testing 32×32 images for 10 classes. MNIST (LeCun et al., 1998) covers 10 classes with 70,000 images. Fashion-MNIST (Xiao et al., 2017) includes 60k training examples and 10k testing examples from 10 categories.
Hardware Specification | Yes | We fix the seed to ensure reproducibility and conduct experiments on an NVIDIA 3090 Ti.
Software Dependencies | No | The paper mentions using 'SGD' and 'Adam' as optimizers and 'FedProx' as the local optimization objective, but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other ancillary software components used for the experiments.
Experiment Setup | Yes | Training Setting: For a fair comparison, we follow (Li et al., 2020b; 2021; Mu et al., 2021). We configure the communication epoch T as 100 and 50, respectively, and all approaches show little or no accuracy gain with more communications. The participant scale K is 10 and 20 for these two datasets. For local training, we leverage FedProx (Li et al., 2020b) as the local optimization objective. The local updating round is 10 for the different settings. We utilize SGD as the local updating optimizer, with weight decay 1e-5 and momentum 0.9. The local learning rate is 0.01 for each client optimization in the above two scenarios. As for the learnable aggregation weight M optimization in SDEA, we set the public data batch size as 64, the optimizer as Adam (Kingma & Ba, 2014) with learning rate ηM as 0.005, and train it for E = 20 rounds. (A hedged configuration sketch of this aggregation-weight step follows below the table.)
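
To make the aggregation-weight optimization in the Experiment Setup row concrete, the following is a minimal, hypothetical PyTorch (≥ 2.0, for torch.func) sketch, not the authors' released code. The function name sdea_aggregate, the softmax parameterization of M, and the entropy-minimization direction of the objective are assumptions on our part; only the optimizer choice (Adam), learning rate ηM = 0.005, public-data batch size 64, and E = 20 rounds are taken from the paper's reported settings. Consult Algorithm 1 in the paper for the authors' exact procedure.

# Hypothetical sketch of SDEA-style learnable-weight aggregation on unlabeled public data.
# Names (sdea_aggregate, client_models, public_loader) and the entropy-minimization
# objective direction are assumptions, not the paper's released implementation.
import copy
import torch
import torch.nn.functional as F


def sdea_aggregate(global_model, client_models, public_loader,
                   rounds=20, lr=5e-3, device="cuda"):
    """Learn per-client aggregation weights M on unlabeled public/proxy data.

    Assumption: benign clients yield confident (low-entropy) predictions on the
    proxy data, so M is optimized to minimize the prediction entropy of the
    weighted-average model.
    """
    K = len(client_models)
    # One learnable logit per client; softmax keeps weights positive and summing to 1.
    M = torch.zeros(K, device=device, requires_grad=True)
    optimizer = torch.optim.Adam([M], lr=lr)  # paper reports Adam, eta_M = 0.005

    # Detach client parameters once so only M receives gradients.
    client_params = [[p.detach().to(device) for p in m.parameters()]
                     for m in client_models]
    param_names = [n for n, _ in global_model.named_parameters()]

    for _ in range(rounds):                      # E = 20 optimization rounds
        for x, _ in public_loader:               # batch size 64, labels unused
            x = x.to(device)
            weights = torch.softmax(M, dim=0)
            # Differentiable weighted average of each parameter across clients.
            merged = [sum(w * p for w, p in zip(weights, ps))
                      for ps in zip(*client_params)]
            # Functional forward pass of the global architecture with merged parameters.
            logits = torch.func.functional_call(
                global_model, dict(zip(param_names, merged)), (x,))
            log_prob = F.log_softmax(logits, dim=1)
            entropy = -(log_prob.exp() * log_prob).sum(dim=1).mean()
            optimizer.zero_grad()
            entropy.backward()
            optimizer.step()

    # Materialize the aggregated model with the learned weights.
    final_w = torch.softmax(M, dim=0).detach()
    aggregated = copy.deepcopy(global_model)
    with torch.no_grad():
        for p_agg, ps in zip(aggregated.parameters(), zip(*client_params)):
            p_agg.copy_(sum(w * p for w, p in zip(final_w, ps)))
    return aggregated, final_w

A typical call would be aggregated, weights = sdea_aggregate(global_model, client_models, public_loader). Under this reading, a client receiving a small learned weight contributes little to the global model, which is how a data- or parameter-poisoned update would be suppressed; whether the paper uses exactly this parameterization and objective sign is not confirmed by the excerpts above.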