Personalized Federated Learning with First Order Model Optimization
Authors: Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, Jose M. Alvarez
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate and characterize our method on a variety of federated settings, datasets, and degrees of local data heterogeneity. Our method outperforms existing alternatives, while also enabling new features for personalized FL such as transfer outside of local data distributions. ... 4 EXPERIMENTS |
| Researcher Affiliation | Collaboration | Michael Zhang Stanford University mzhang@cs.stanford.edu Karan Sapra NVIDIA ksapra@nvidia.com Sanja Fidler NVIDIA sfidler@nvidia.com Serena Yeung Stanford University syyeung@stanford.edu Jose M. Alvarez NVIDIA josea@nvidia.com |
| Pseudocode | No | The paper describes the FedFomo update mathematically and intuitively but does not provide a formal pseudocode block or algorithm table (a hedged sketch of the update is given after this table). |
| Open Source Code | No | The paper does not contain any explicit statement from the authors that they are releasing their code for the work described in this paper, nor does it provide a direct link to their repository. Footnote 1 provides links to implementations of baseline methods, not their own. |
| Open Datasets | Yes | Based on prior work (McMahan et al., 2016; Liang et al., 2020), we compare methods with the MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and CIFAR-100 datasets. |
| Dataset Splits | Yes | We use the 15 client 100% participation setup with 5 latent distributions organized over the CIFAR-10 dataset, and consider both the evaluation curve and final test accuracy over allocating a fraction {0.01, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9} of all clients' local data to Dval, and track evaluation over 20 communication rounds with 5 epochs of local training per round. On average, each client has 3333 local data points. ... and separate Dtrain and Dval with an 80-20 split. |
| Hardware Specification | No | The paper does not specify any particular CPU, GPU model, or other hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions the Opacus PyTorch library but does not provide specific version numbers for Python, PyTorch, Opacus, or any other critical software dependencies needed for reproducibility. |
| Experiment Setup | Yes | All accuracies are reported with mean and standard deviation over three runs, with local training epochs E = 5, the same number of communication rounds (20 for 15 clients, 100% participation; 100 for 100 clients, 10% participation) and learning rate (0.01 for MNIST, 0.1 for CIFAR-10). ... We train with SGD, 0.1 learning rate, 0 momentum, 1e-4 weight decay, and 0.99 learning rate decay for CIFAR-10/100, and do the same except with 0.01 learning rate for MNIST. For FedFomo we use n = 5 and n = 10 downloads per client, ε = 0.3 with 0.05 decay each round, and separate Dtrain and Dval with an 80-20 split. (See the configuration sketch after this table.) |
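
Since the paper gives the FedFomo update only in prose and equations rather than pseudocode, the following is a minimal sketch of that first-order weighted update, assuming PyTorch models and a client-side loader for the held-out Dval split (the 80-20 Dtrain/Dval split quoted above). The function names `validation_loss` and `fomo_update`, and the small epsilon added to the norm, are illustrative choices, not the authors' code.

```python
# Hedged sketch of the FedFomo client update: each downloaded model theta_n is
# weighted by the (clipped) decrease in local validation loss it yields,
# normalized by its parameter-space distance from the current local model.
import torch

def validation_loss(model, val_loader, loss_fn, device="cpu"):
    """Average loss of `model` on the client's local validation split Dval."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for x, y in val_loader:
            x, y = x.to(device), y.to(device)
            total += loss_fn(model(x), y).item() * x.size(0)
            count += x.size(0)
    return total / max(count, 1)

def fomo_update(local_model, downloaded_models, val_loader, loss_fn):
    """w_n ∝ max((L(theta_local) - L(theta_n)) / ||theta_n - theta_local||, 0),
    then theta_local <- theta_local + sum_n w_n * (theta_n - theta_local)."""
    base_loss = validation_loss(local_model, val_loader, loss_fn)
    base_params = torch.nn.utils.parameters_to_vector(local_model.parameters())

    weights, deltas = [], []
    for model_n in downloaded_models:
        params_n = torch.nn.utils.parameters_to_vector(model_n.parameters())
        delta = params_n - base_params
        loss_n = validation_loss(model_n, val_loader, loss_fn)
        w = (base_loss - loss_n) / (delta.norm() + 1e-12)  # illustrative epsilon
        weights.append(torch.clamp(w, min=0.0))            # keep only helpful models
        deltas.append(delta)

    total = sum(weights)
    if total > 0:  # update only if at least one downloaded model lowers the loss
        new_params = base_params + sum((w / total) * d for w, d in zip(weights, deltas))
        torch.nn.utils.vector_to_parameters(new_params, local_model.parameters())
    return local_model
```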
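The Experiment Setup row quotes concrete optimizer hyperparameters; the sketch below wires them into a standard PyTorch configuration. The placeholder model and the reading of the 0.99 learning-rate decay as a per-round `ExponentialLR` step are assumptions, since the paper does not state how the decay is scheduled.

```python
# Hedged sketch of the quoted local training configuration for CIFAR-10/100.
import torch

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder CIFAR-10-sized model

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,            # 0.01 for MNIST, per the paper
    momentum=0.0,
    weight_decay=1e-4,
)
# Assumption: the 0.99 decay is applied once per communication round.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)

# Per communication round: E = 5 local epochs of SGD, then decay the learning rate.
# for round in range(num_rounds):
#     for epoch in range(5):
#         ...local SGD steps...
#     scheduler.step()
```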