FedBN: Federated Learning on Non-IID Features via Local Batch Normalization
Authors: Xiaoxiao Li, Meirui Jiang, Xiaofei Zhang, Michael Kamp, Qi Dou
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The resulting scheme, called FedBN, outperforms both classical FedAvg, as well as the state-of-the-art for non-iid data (FedProx) on our extensive experiments. These empirical results are supported by a convergence analysis that shows in a simplified setting that FedBN has a faster convergence rate than FedAvg. Code is available at https://github.com/med-air/FedBN. |
| Researcher Affiliation | Academia | Xiaoxiao Li, Department of Computer Science, Princeton University (xiaoxiao.li@aya.yale.edu); Meirui Jiang, Department of Computer Science and Engineering, The Chinese University of Hong Kong (mrjiang@cse.cuhk.edu.hk); Xiaofei Zhang, Department of Statistics, Iowa State University (xfzhang@iastate.edu); Michael Kamp, Dept. of Data Science and AI, Faculty of IT, Monash University (michael.kamp@monash.edu); Qi Dou, Department of Computer Science and Engineering, The Chinese University of Hong Kong (qdou@cse.cuhk.edu.hk) |
| Pseudocode | Yes | C FEDBN ALGORITHM: We describe the detailed algorithm of our proposed FedBN in Algorithm 1 (Federated Learning using FedBN). See the aggregation sketch below the table. |
| Open Source Code | Yes | Code is available at https://github.com/med-air/FedBN. |
| Open Datasets | Yes | Specifically, we use the following five datasets: SVHN Netzer et al. (2011), USPS Hull (1994), SynthDigits Ganin & Lempitsky (2015), MNIST-M Ganin & Lempitsky (2015) and MNIST LeCun et al. (1998). [...] We conduct the classification task on natural images from Office-Caltech10...Our second dataset is DomainNet...We include four medical institutions (NYU, USM, UM, UCLA; each is viewed as a client) from ABIDE I... |
| Dataset Splits | No | The paper mentions 'Testing samples are held out' and describes a '5-trial repeating experiment' for robust evaluation, but it does not explicitly state the use of a separate 'validation set' or 'validation split' for hyperparameter tuning. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions several toolkits for integration (PySyft, Google TFF, Flower, dlplatform, FedML) but does not provide specific version numbers for the software dependencies used in their own experiments (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For model training, we use the cross-entropy loss and SGD optimizer with a learning rate of 10^-2. If not specified, our default setting for local update epochs is E = 1, and the default setting for the amount of data at each client is 10% of the dataset's original size. [...] During the training process, we use the SGD optimizer with learning rate 10^-2 and cross-entropy loss; we set batch size to 32 and training epochs to 300. [...] For all the strategies, we set batch size as 100. The total training local epoch is 50 with learning rate 10^-2 with SGD optimizer. Local update epoch for each client is E = 1. (A minimal local-update sketch reflecting this setup follows the table.) |
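
The pseudocode row refers to Algorithm 1 of the paper (Federated Learning using FedBN). As an illustration only, here is a minimal sketch of the core aggregation step: all parameters are weighted-averaged across clients except BatchNorm layers, which each client keeps local. It assumes PyTorch models whose BatchNorm layers have `bn` in their parameter names; the function name `fedbn_aggregate` and the name-based filter are our assumptions, not the authors' implementation (see https://github.com/med-air/FedBN for the official code).

```python
import torch


@torch.no_grad()
def fedbn_aggregate(server_model, client_models, client_weights):
    """FedAvg-style aggregation that leaves BatchNorm layers untouched:
    non-BN parameters are weighted-averaged on the server and broadcast
    back, while each client keeps its own BN weights and running stats."""
    global_state = server_model.state_dict()
    for name in global_state:
        if 'bn' in name:  # assumption: BN layers are named with 'bn'
            continue
        # weighted average of this parameter across all clients
        stacked = torch.stack([
            w * c.state_dict()[name].float()
            for w, c in zip(client_weights, client_models)
        ])
        global_state[name].copy_(stacked.sum(dim=0).to(global_state[name].dtype))
    # push the averaged non-BN parameters back to every client
    for client in client_models:
        client_state = client.state_dict()
        for name in global_state:
            if 'bn' not in name:
                client_state[name].copy_(global_state[name])
    return server_model, client_models
```

Skipping the BN keys is the only difference from plain FedAvg aggregation, which matches the paper's description of FedBN as keeping batch normalization strictly local.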
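
The experiment-setup row can likewise be read as a per-client local update. The following is a hedged sketch of the reported hyperparameters (SGD with learning rate 10^-2, cross-entropy loss, E = 1 local epoch); the helper name `local_update` and the data-loader interface are assumed for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn


def local_update(model, train_loader, epochs=1, lr=1e-2, device='cpu'):
    """One client's local training pass, mirroring the reported setup:
    SGD (lr = 1e-2), cross-entropy loss, E = 1 local epoch by default."""
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```

In a FedBN round, each client would run `local_update` on its own loader and the server would then call `fedbn_aggregate` on the resulting client models.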