Personalized Federated Learning with First Order Model Optimization

Authors: Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, Jose M. Alvarez

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate and characterize our method on a variety of federated settings, datasets, and degrees of local data heterogeneity. Our method outperforms existing alternatives, while also enabling new features for personalized FL such as transfer outside of local data distributions. ... 4 EXPERIMENTS
Researcher Affiliation | Collaboration | Michael Zhang (Stanford University, mzhang@cs.stanford.edu); Karan Sapra (NVIDIA, ksapra@nvidia.com); Sanja Fidler (NVIDIA, sfidler@nvidia.com); Serena Yeung (Stanford University, syyeung@stanford.edu); Jose M. Alvarez (NVIDIA, josea@nvidia.com)
Pseudocode | No | The paper describes the FedFomo update mathematically and intuitively but does not provide a formal pseudocode block or algorithm table. (A hedged sketch of such an update appears after this table.)
Open Source Code | No | The paper contains no explicit statement from the authors that they are releasing code for this work, nor a direct link to their repository. Footnote 1 provides links to implementations of baseline methods, not the authors' own.
Open Datasets | Yes | Based on prior work (McMahan et al., 2016; Liang et al., 2020), we compare methods with the MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and CIFAR-100 datasets.
Dataset Splits | Yes | We use the 15 client, 100% participation setup with 5 latent distributions organized over the CIFAR-10 dataset, and consider both the evaluation curve and final test accuracy over allocating a fraction {0.01, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9} of all clients' local data to Dval, and track evaluation over 20 communication rounds with 5 epochs of local training per round. On average, each client has 3333 local data points. ... and separate Dtrain and Dval with an 80-20 split. (See the split sketch after this table.)
Hardware Specification | No | The paper does not specify any particular CPU, GPU model, or other hardware used for running the experiments.
Software Dependencies | No | The paper mentions the Opacus PyTorch library but does not provide version numbers for Python, PyTorch, Opacus, or any other software dependency critical for reproducibility.
Experiment Setup | Yes | All accuracies are reported with mean and standard deviation over three runs, with local training epochs E = 5, the same number of communication rounds (20 for 15 clients, 100% participation; 100 for 100 clients, 10% participation), and learning rate 0.01 for MNIST, 0.1 for CIFAR-10. ... We train with SGD, 0.1 learning rate, 0 momentum, 1e-4 weight decay, and 0.99 learning rate decay for CIFAR-10/100, and do the same except with 0.01 learning rate for MNIST. For FedFomo we use n = 5 and n = 10 downloads per client, ε = 0.3 with 0.05 decay each round, and separate Dtrain and Dval with an 80-20 split. (See the configuration sketch after this table.)
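
Since the paper provides no pseudocode (Pseudocode row above), the following is a minimal sketch of a FedFomo-style first-order update, assuming the weighting the paper describes: each downloaded model is weighted by how much it reduces the client's validation loss, normalized by its parameter distance from the local model, with negative weights clipped to zero. The callback client_val_loss, the 1e-12 stabilizer, and all variable names are illustrative assumptions, not the authors' code.

```python
import torch

def fomo_weights(local_model, downloaded_models, client_val_loss):
    """Sketch: w_n is proportional to max(L_val(local) - L_val(model_n), 0)
    divided by ||theta_n - theta_local||, then normalized to sum to 1.
    client_val_loss(model) is an assumed callback returning the model's loss
    on the client's local validation split Dval."""
    with torch.no_grad():
        theta_i = torch.nn.utils.parameters_to_vector(local_model.parameters())
        loss_i = client_val_loss(local_model)
        raw = []
        for model_n in downloaded_models:
            theta_n = torch.nn.utils.parameters_to_vector(model_n.parameters())
            gain = loss_i - client_val_loss(model_n)      # validation-loss improvement
            dist = torch.norm(theta_n - theta_i) + 1e-12  # avoid division by zero
            raw.append(torch.clamp(gain / dist, min=0.0))
        w = torch.stack(raw)
        total = w.sum()
        return w / total if total > 0 else w              # all-zero weights keep the local model

def fomo_update(local_model, downloaded_models, client_val_loss):
    """Personalized update: theta_i <- theta_i + sum_n w_n * (theta_n - theta_i)."""
    w = fomo_weights(local_model, downloaded_models, client_val_loss)
    with torch.no_grad():
        theta_i = torch.nn.utils.parameters_to_vector(local_model.parameters())
        delta = torch.zeros_like(theta_i)
        for w_n, model_n in zip(w, downloaded_models):
            theta_n = torch.nn.utils.parameters_to_vector(model_n.parameters())
            delta += w_n * (theta_n - theta_i)
        torch.nn.utils.vector_to_parameters(theta_i + delta, local_model.parameters())
```

The ε-greedy selection of which n client models to download (the ε = 0.3 setting in the Experiment Setup row) is omitted here.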
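
The Dataset Splits row reports an 80-20 Dtrain/Dval split of each client's local data. Below is a minimal sketch of such a split over a torchvision CIFAR-10 shard, assuming each client holds a list of indices into the full training set; the function and variable names are illustrative, not taken from the paper.

```python
import torch
from torchvision import datasets, transforms

# CIFAR-10 via torchvision; the paper also evaluates MNIST and CIFAR-100.
cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                           transform=transforms.ToTensor())

def split_client_data(client_indices, val_frac=0.2, seed=0):
    """Split one client's local indices into Dtrain and Dval (80-20 by default)."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(client_indices), generator=g).tolist()
    n_val = int(val_frac * len(client_indices))
    val_idx = [client_indices[i] for i in perm[:n_val]]
    train_idx = [client_indices[i] for i in perm[n_val:]]
    return (torch.utils.data.Subset(cifar10, train_idx),
            torch.utils.data.Subset(cifar10, val_idx))

# Example: one client with 3333 local points (the average reported for the 15-client setup).
d_train, d_val = split_client_data(list(range(3333)))
```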
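
Finally, the Experiment Setup row specifies the optimizer and schedule. A sketch of that configuration in PyTorch follows; the placeholder linear model and the choice to step the learning-rate decay and the ε decay once per communication round are assumptions the paper does not spell out.

```python
import torch

# Placeholder model; the paper's actual architectures differ.
model = torch.nn.Linear(3 * 32 * 32, 10)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,            # 0.01 for MNIST, 0.1 for CIFAR-10/100
    momentum=0.0,
    weight_decay=1e-4,
)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)  # 0.99 lr decay

epsilon = 0.3                             # exploration rate for client model downloads
for communication_round in range(20):     # 20 rounds: 15 clients, 100% participation
    for epoch in range(5):                # E = 5 local training epochs per round
        pass                              # local SGD pass over the client's Dtrain
    scheduler.step()                      # assumption: lr decayed once per round
    epsilon = max(epsilon - 0.05, 0.0)    # ε = 0.3 decayed by 0.05 each round
```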