Personalized Federated Learning using Hypernetworks

Authors: Aviv Shamsian, Aviv Navon, Ethan Fetaya, Gal Chechik

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test pFedHN empirically in several personalized federated learning challenges and find that it outperforms previous methods. Finally, since hypernetworks share information across clients, we show that pFedHN can generalize better to new clients whose distributions differ from any client observed during training."
Researcher Affiliation | Collaboration | "1 Bar-Ilan University, Ramat Gan, Israel; 2 Nvidia, Tel-Aviv, Israel. Correspondence to: Aviv Shamsian <aviv.shamsian@live.biu.ac.il>, Aviv Navon <aviv.navon@biu.ac.il>."
Pseudocode | Yes | "Algorithm 1: Personalized Federated Hypernetwork"
Open Source Code | Yes | "We make our source code publicly available at: https://github.com/AvivSham/pFedHN."
Open Datasets | Yes | "We evaluate pFedHN in several learning setups using three common image-classification datasets: CIFAR10, CIFAR100, and Omniglot (Krizhevsky & Hinton, 2009; Lake et al., 2015)."
Dataset Splits | Yes | "For the CIFAR experiments, we pre-allocate 10,000 training examples for validation. For the Omniglot dataset, we use a 70%/15%/15% split for train/validation/test sets."
Hardware Specification | No | The paper does not report the hardware used for its experiments (no GPU/CPU models, clock speeds, or memory amounts); it describes only the general experimental setup.
Software Dependencies | No | The paper does not list the ancillary software needed to replicate the experiments (e.g., library or solver names with version numbers, such as Python 3.8).
Experiment Setup | Yes | "We tune the hyperparameters of all methods using a pre-allocated held-out validation set. Full experimental details are provided in Appendix B. ... For all experiments presented in the main text, we use a fully-connected hypernetwork with 3 hidden layers of 100 hidden units each. For all relevant baselines, we aggregate over 5 clients at each round. We set K = 3, i.e., 60 local steps, for the pFedMe algorithm..."
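Putting the quoted details together, the following is a minimal NumPy sketch of the pFedHN idea (Algorithm 1): the server holds a fully-connected hypernetwork with 3 hidden layers of 100 units (as stated in the main-text setup) plus a learned embedding per client; each round it generates one sampled client's personal weights, the client runs local SGD steps, and the returned weight delta is pushed back through the hypernetwork with a vector-Jacobian product. The embedding size, target-model size, client count, learning rates, and the toy least-squares client objective are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes follow the paper where stated (3 hidden layers of 100 units);
# everything else below is an illustrative assumption.
EMBED_DIM, HIDDEN, TARGET = 16, 100, 50
N_CLIENTS, LOCAL_STEPS = 10, 60
ALPHA, BETA = 0.05, 1e-3  # local / hypernetwork step sizes (assumed)

def init_layer(m, n):
    """He-style initialization for one fully-connected layer."""
    return rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n)

# Hypernetwork: embedding -> 100 -> 100 -> 100 -> flattened client weights.
layers = [init_layer(m, n) for m, n in
          [(EMBED_DIM, HIDDEN), (HIDDEN, HIDDEN),
           (HIDDEN, HIDDEN), (HIDDEN, TARGET)]]
# One trainable embedding vector per client, held by the server.
embeddings = rng.standard_normal((N_CLIENTS, EMBED_DIM))

def hnet_forward(v):
    """Maps a client embedding to its personal weights theta.
    Also returns the activations needed for backprop."""
    acts, x = [v], v
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
        acts.append(x)
    return x, acts

def hnet_backward(acts, grad_out):
    """Vector-Jacobian product: propagates a gradient w.r.t. theta back
    to the hypernetwork parameters and the client embedding."""
    grads, g = [], grad_out
    for i in reversed(range(len(layers))):
        W, _ = layers[i]
        if i < len(layers) - 1:
            g = g * (acts[i + 1] > 0)        # ReLU mask
        grads.append((np.outer(acts[i], g), g.copy()))
        g = g @ W.T
    return list(reversed(grads)), g          # per-layer grads, embedding grad

def local_loss_grad(theta, A, y):
    """Toy client objective: least squares on private data (A, y),
    standing in for the paper's image classifiers."""
    return A.T @ (A @ theta - y) / len(y)

# Synthetic private data per client, purely for illustration.
data = [(rng.standard_normal((32, TARGET)), rng.standard_normal(32))
        for _ in range(N_CLIENTS)]

for _round in range(50):
    i = rng.integers(N_CLIENTS)              # server samples a client
    theta, acts = hnet_forward(embeddings[i])
    theta_tilde = theta.copy()
    A, y = data[i]
    for _ in range(LOCAL_STEPS):             # client's local SGD steps
        theta_tilde -= ALPHA * local_loss_grad(theta_tilde, A, y)
    delta = theta_tilde - theta              # client returns only the delta
    # Server: chain rule through the hypernetwork. Since delta points in a
    # descent direction for the client loss, descend by propagating -delta.
    layer_grads, emb_grad = hnet_backward(acts, -delta)
    for (W, b), (gW, gb) in zip(layers, layer_grads):
        W -= BETA * gW
        b -= BETA * gb
    embeddings[i] -= BETA * emb_grad
```

Note the communication pattern this sketch mirrors: the hypernetwork parameters never leave the server; only the generated weights and their deltas cross the network, which is what lets pFedHN share information across clients through the shared hypernetwork.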