Bayesian Coreset Optimization for Personalized Federated Learning

Authors: Prateek Chanda, Shrey Modi, Ganesh Ramakrishnan

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on different benchmark datasets based on a variety of recent personalized federated learning architectures show significant gains as compared to random sampling on the training data followed by federated learning, thereby indicating how intelligently selecting such training samples can help in performance. Additionally, through experiments on medical datasets our proposed method showcases some gains as compared to other submodular optimization-based approaches used for subset selection on the client's data.
Researcher Affiliation | Academia | Prateek Chanda, Department of Computer Science, Indian Institute of Technology Bombay, India (prateekch@cse.iitb.ac.in); Shrey Modi, Department of Computer Science, Indian Institute of Technology Bombay, India (200020135@iitb.ac.in); Ganesh Ramakrishnan, Department of Computer Science, Indian Institute of Technology Bombay, India (ganesh@cse.iitb.ac.in)
Pseudocode | Yes | Algorithm 1: CORESET-PFEDBAYES; Algorithm 2: Accelerated IHT (A-IHT) for Bayesian Coreset Optimization (a minimal A-IHT sketch follows the table).
Open Source Code | Yes | We share our code on GitHub at Link
Open Datasets | Yes | We generate the non-i.i.d. datasets based on three public benchmark datasets: MNIST (LeCun et al., 1998), FMNIST (Fashion-MNIST) (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al., 2009).
Dataset Splits | No | The paper mentions using MNIST, FMNIST, CIFAR-10, and medical datasets, and gives details about client data distribution and random subset selection (e.g., "randomly choose λ = 0.1 fraction of samples on the client side"), but it does not specify explicit train/validation/test splits with percentages or counts, so the splits themselves cannot be reproduced exactly (an illustrative client-partition and λ-fraction sampling sketch follows the table).
Hardware Specification | Yes | All the experiments have been done using the following configuration: Nvidia RTX A4000 (16 GB) and Apple M2 Pro (10 cores, 16 GB memory).
Software Dependencies | No | The paper mentions using the Submodlib library and names other software components, but it does not provide version numbers for any of its software dependencies.
Experiment Setup | Yes | Learning rate hyperparameters: As per Zhang et al. (2022b)'s proposal, i.e., PFEDBAYES, the learning rates for the personalized (client) model and the global model (η1, η2) are set to 0.001, since these choices result in the best setting for PFEDBAYES. ... Personalization hyperparameter: The ζ parameter ... we fix the ζ parameter for our proposal CORESET-PFEDBAYES to the best setting given by the baseline. In Zhang et al. (2022b) the authors tune ζ ∈ {0.5, 1, 5, 10, 20} and find that ζ = 10 results in the best setting. We therefore fix the personalization parameter ζ = 10 (a hedged configuration sketch follows the table).
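
The Pseudocode row lists Algorithm 2 as Accelerated IHT (A-IHT) for Bayesian coreset optimization. Below is a minimal, self-contained sketch of accelerated iterative hard thresholding applied to the sparse non-negative regression formulation commonly used for Bayesian coreset construction (pick weights w with at most k non-zeros so that Φw approximates the full-data target y = Φ1). The function name `a_iht`, the finite-projection setup, and the step-size and momentum choices are illustrative assumptions, not the authors' exact Algorithm 2.

```python
import numpy as np

def a_iht(Phi, y, k, n_iters=100):
    """Accelerated iterative hard thresholding (sketch) for
        min_w 0.5 * ||Phi @ w - y||^2   s.t.   ||w||_0 <= k,  w >= 0.

    Phi : (J, N) matrix whose columns are per-data-point vectors
          (e.g. log-likelihoods at J sampled parameters);
    y   : target vector, e.g. the full-data sum Phi @ ones(N).
    Returns a k-sparse non-negative weight vector of length N.
    """
    N = Phi.shape[1]
    w, w_prev = np.zeros(N), np.zeros(N)
    # fixed step size 1/L, with L the Lipschitz constant of the gradient
    step = 1.0 / (np.linalg.norm(Phi, 2) ** 2 + 1e-12)
    for t in range(n_iters):
        beta = t / (t + 3.0)                  # Nesterov-style momentum weight
        z = w + beta * (w - w_prev)           # extrapolated point
        grad = Phi.T @ (Phi @ z - y)          # gradient of the quadratic loss
        u = np.maximum(z - step * grad, 0.0)  # gradient step + non-negativity
        if k < N:                             # hard threshold: keep k largest entries
            u[np.argsort(u)[:-k]] = 0.0
        w_prev, w = w, u
    return w

# toy usage: 500 data points summarised by J = 50 projections, coreset of size 30
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 500))
y = Phi @ np.ones(500)
w = a_iht(Phi, y, k=30)
print((w > 0).sum(), np.linalg.norm(Phi @ w - y))
```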
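The dataset rows mention generating non-i.i.d. client datasets from MNIST/FMNIST/CIFAR-10 and randomly choosing a λ = 0.1 fraction of samples on the client side as the random-sampling baseline. The sketch below shows one common label-shard recipe for building such non-i.i.d. splits and the λ-fraction random baseline; the exact partitioning scheme and the helper names (`noniid_label_shards`, `random_fraction`) are assumptions, since the paper does not spell out the splits.

```python
import numpy as np

def noniid_label_shards(labels, n_clients, shards_per_client=2, seed=0):
    """Illustrative non-i.i.d. partition (assumed recipe, not the paper's exact one):
    sort sample indices by label, cut them into equal shards, and assign each
    client a few shards so it only sees a handful of classes."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels)                                  # group indices by class
    shards = np.array_split(order, n_clients * shards_per_client)
    shard_ids = rng.permutation(len(shards))                    # shuffle shard assignment
    return [
        np.concatenate([shards[s] for s in
                        shard_ids[c * shards_per_client:(c + 1) * shards_per_client]])
        for c in range(n_clients)
    ]

def random_fraction(client_idx, lam=0.1, seed=0):
    """Random-sampling baseline: keep a lambda-fraction of a client's sample indices."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(lam * len(client_idx)))
    return rng.choice(client_idx, size=n_keep, replace=False)

# toy usage: 10 classes with 600 samples each, split across 20 clients
labels = np.repeat(np.arange(10), 600)
clients = noniid_label_shards(labels, n_clients=20)
subset = random_fraction(clients[0], lam=0.1)
```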
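For the experiment-setup row, here is a small configuration sketch collecting the hyperparameter values reported above (η1 = η2 = 0.001, ζ = 10 chosen from {0.5, 1, 5, 10, 20}, λ = 0.1). The dictionary keys are illustrative names, not identifiers from the released code.

```python
# Reported hyperparameters for CORESET-PFEDBAYES / PFEDBAYES (values from the text;
# key names are illustrative assumptions).
CONFIG = {
    "eta_1": 1e-3,           # personalized (client) model learning rate
    "eta_2": 1e-3,           # global model learning rate
    "zeta": 10,              # personalization weight, tuned over {0.5, 1, 5, 10, 20}
    "lambda_fraction": 0.1,  # fraction of client samples selected as the coreset
}
```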