A Bayesian Approach for Personalized Federated Learning in Heterogeneous Settings

Authors: Disha Makhija, Joydeep Ghosh, Nhat Ho

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 experiments) | In this section, we present an experimental evaluation of our method and compare it with different baselines under diverse homogeneous and heterogeneous client settings.
Researcher Affiliation | Academia | Disha Makhija, Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78705, disham@utexas.edu; Joydeep Ghosh, Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78705, jghosh@utexas.edu; Nhat Ho, Statistics and Data Science, University of Texas at Austin, Austin, TX 78705, minhnhat@utexas.edu
Pseudocode | Yes | B Algorithm: The pseudo-code of the algorithm used in the FedBNN method is included in Algorithm 1.
Open Source Code | No | The code will be released on acceptance of the paper.
Open Datasets | Yes | Datasets: We choose three different datasets commonly used in prior federated learning works from the popular FL benchmark LEAF [13], including MNIST, CIFAR-10, and CIFAR-100.
Dataset Splits | Yes | MNIST contains 10 different classes corresponding to the 10 digits, with 50,000 28×28 black-and-white training images and 10,000 images for validation.
Hardware Specification | Yes | All the models are trained on a 4-GPU machine with GeForce RTX 3090 GPUs and 24 GB of memory per GPU.
Software Dependencies | No | The paper mentions optimizers (Adam) and methods (Bayes by Backprop) but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | The number of local epochs is set to 20 and the size of AD is kept as 2000. ... For optimizing the prior parameters at each client according to Equation 4, we use an Adam optimizer with learning rate = 0.0001 and run the prior optimization procedure for 100 steps. Then, with the optimized prior, we train the local BNN using Bayes-by-Backprop with an Adam optimizer, learning rate = 0.001, and batch size = 128. The noise effect γ is selected after fine-tuning and kept at 0.7.
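
As a concrete reading of the reported experiment setup, the sketch below wires the quoted hyperparameters (20 local epochs, Adam with learning rate 0.001, batch size 128 via the data loader) into a minimal PyTorch Bayes-by-Backprop training loop. Only those numbers come from the paper; the BayesianLinear layer, the SmallBNN model, the train_local_bnn helper, and the per-epoch KL weighting are illustrative assumptions, and the prior-optimization step (Equation 4), the alignment dataset AD, and the noise factor γ = 0.7 are not reproduced here.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BayesianLinear(nn.Module):
        """Mean-field Gaussian linear layer for Bayes-by-Backprop (assumed architecture)."""
        def __init__(self, in_features, out_features, prior_std=1.0):
            super().__init__()
            self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
            self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
            self.b_mu = nn.Parameter(torch.zeros(out_features))
            self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))
            self.prior_std = prior_std

        def forward(self, x):
            w_sigma = F.softplus(self.w_rho)
            b_sigma = F.softplus(self.b_rho)
            # Reparameterised samples of weights and biases.
            w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
            b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
            # KL(q || N(0, prior_std^2)), accumulated for the ELBO.
            self.kl = self._kl(self.w_mu, w_sigma) + self._kl(self.b_mu, b_sigma)
            return F.linear(x, w, b)

        def _kl(self, mu, sigma):
            prior_var = self.prior_std ** 2
            return 0.5 * torch.sum(
                (sigma ** 2 + mu ** 2) / prior_var - 1.0
                - 2.0 * torch.log(sigma / self.prior_std)
            )

    class SmallBNN(nn.Module):
        """Two-layer Bayesian classifier; width and depth are placeholders."""
        def __init__(self, in_dim=784, hidden=200, n_classes=10):
            super().__init__()
            self.fc1 = BayesianLinear(in_dim, hidden)
            self.fc2 = BayesianLinear(hidden, n_classes)

        def forward(self, x):
            h = F.relu(self.fc1(x.flatten(1)))
            return self.fc2(h)

        def kl(self):
            return self.fc1.kl + self.fc2.kl

    def train_local_bnn(model, loader, local_epochs=20, lr=1e-3, device="cpu"):
        """Local training with the reported settings: 20 epochs, Adam, lr = 0.001."""
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        model.to(device).train()
        for _ in range(local_epochs):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                nll = F.cross_entropy(model(x), y, reduction="sum")
                # Count the KL term once per epoch by spreading it over the batches.
                loss = nll + model.kl() / len(loader)
                opt.zero_grad()
                loss.backward()
                opt.step()
        return model

The reported batch size of 128 would enter through the data loader, e.g. torch.utils.data.DataLoader(local_dataset, batch_size=128, shuffle=True), with local_dataset standing in for a client's private data.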