Personalized Federated Learning using Hypernetworks
Authors: Aviv Shamsian, Aviv Navon, Ethan Fetaya, Gal Chechik
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test pFedHN empirically in several personalized federated learning challenges and find that it outperforms previous methods. Finally, since hypernetworks share information across clients, we show that pFedHN can generalize better to new clients whose distributions differ from any client observed during training. |
| Researcher Affiliation | Collaboration | 1) Bar-Ilan University, Ramat Gan, Israel; 2) Nvidia, Tel-Aviv, Israel. Correspondence to: Aviv Shamsian <aviv.shamsian@live.biu.ac.il>, Aviv Navon <aviv.navon@biu.ac.il>. |
| Pseudocode | Yes | Algorithm 1 Personalized Federated Hypernetwork |
| Open Source Code | Yes | We make our source code publicly available at: https://github.com/AvivSham/pFedHN. |
| Open Datasets | Yes | We evaluate p Fed HN in several learning setups using three common image-classification datasets: CIFAR10, CIFAR100, and Omniglot (Krizhevsky & Hinton, 2009; Lake et al., 2015). |
| Dataset Splits | Yes | For the CIFAR experiments, we pre-allocate 10,000 training examples for validation. For the Omniglot dataset, we use a 70%/15%/15% split for train/validation/test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments, only mentioning general experimental setup. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | We tune the hyperparameters of all methods using a pre-allocated held-out validation set. Full experimental details are provided in Appendix B. ... For all experiments presented in the main text, we use a fully-connected hypernetwork with 3 hidden layers of 100 hidden units each. For all relevant baselines, we aggregate over 5 clients at each round. We set K = 3, i.e., 60 local steps, for the pFedMe algorithm... (the stated hypernetwork architecture and Algorithm 1 are sketched below this table) |
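For concreteness, here is a minimal sketch of the hypernetwork the setup describes: a fully-connected network with 3 hidden layers of 100 units that maps a learned per-client embedding to a flat parameter vector for that client's personal model. Only the hidden-layer sizes come from the paper; PyTorch, the embedding dimension, and the target parameter count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    """Sketch of a fully-connected hypernetwork: client embedding -> flat
    parameter vector of that client's personal model. Hidden sizes follow
    the paper's stated setup (3 hidden layers of 100 units); the embedding
    size and output size are illustrative assumptions."""

    def __init__(self, n_clients: int, embedding_dim: int,
                 target_param_count: int, hidden_dim: int = 100):
        super().__init__()
        # One learned embedding vector v_i per client.
        self.embeddings = nn.Embedding(n_clients, embedding_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Head emitting a flat vector of target-network weights.
        self.head = nn.Linear(hidden_dim, target_param_count)

    def forward(self, client_id: torch.Tensor) -> torch.Tensor:
        v = self.embeddings(client_id)   # v_i
        return self.head(self.mlp(v))    # theta_i = h(v_i; phi)
```

The personal model would then reshape this flat output into its layer tensors; that reshaping is omitted here.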
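And a hedged sketch of one communication round in the spirit of the paper's Algorithm 1: the server generates a client's weights from the hypernetwork, the client runs its local steps, and the resulting parameter change is pushed back through the hypernetwork via the chain rule to update the hypernetwork weights and the client embedding. The `local_train_fn` callable and the optimizer wiring are hypothetical stand-ins, not the authors' code.

```python
import torch

def pfedhn_round(hnet, client_id, local_train_fn, optimizer):
    """One pFedHN communication round (sketch, assuming the pieces above).
    `optimizer` is assumed to cover hnet's parameters and embeddings."""
    # Server: generate this client's personal weights theta_i = h(v_i; phi).
    theta = hnet(torch.tensor([client_id]))
    # Client: run K local SGD steps from a detached copy of theta
    # (local_train_fn is a hypothetical stand-in for the local update).
    theta_trained = local_train_fn(theta.detach().clone())
    # Parameter change accumulated by the local steps.
    delta_theta = theta.detach() - theta_trained
    # Chain rule: the gradient w.r.t. phi and v_i is
    # (d theta / d phi)^T delta_theta, obtained here via autograd.
    optimizer.zero_grad()
    theta.backward(delta_theta)
    optimizer.step()
```

Note the round transmits only the generated weights and their change, never the hypernetwork itself, which is the paper's motivation for decoupling communication cost from the hypernetwork's size.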