PeFLL: Personalized Federated Learning by Learning to Learn

Authors: Jonathan Scott, Hossein Zakerinia, Christoph H. Lampert

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we report on our experimental evaluation. The values reported in every table and plot are given as the mean together with the standard deviation across three random seeds.
Researcher Affiliation | Academia | Jonathan Scott, Institute of Science and Technology Austria (ISTA), jonathan.scott@ist.ac.at; Hossein Zakerinia, Institute of Science and Technology Austria (ISTA), hossein.zakerinia@ist.ac.at; Christoph H. Lampert, Institute of Science and Technology Austria (ISTA), chl@ist.ac.at
Pseudocode | Yes | Pseudocode of the specific steps is provided in Algorithms 1 and 2. (A hedged structural sketch of the embedding-network/hypernetwork pipeline is given after this table.)
Open Source Code | Yes | We provide the code as supplemental material. We will publish it when the anonymity requirement is lifted.
Open Datasets | Yes | For our experiments, we use three datasets that are standard benchmarks for FL: CIFAR10/CIFAR100 (Krizhevsky, 2009) and FEMNIST (Caldas et al., 2018). [...] Additional experiments on the Shakespeare dataset (Caldas et al., 2018) are provided in Appendix A.
Dataset Splits | Yes | The hyperparameters for all methods are tuned using validation data that was held out from the training set (10,000 samples for CIFAR10 and CIFAR100, spread across the clients, and 10% of each client's data for FEMNIST). (A sketch of the two split rules is given after this table.)
Hardware Specification | No | The paper mentions support from 'Scientific Computing (Sci Comp)' and that a ResNet20 implementation was used, but does not provide specific details on the CPU, GPU, or memory used for the experiments.
Software Dependencies | No | The paper mentions the use of SGD as the optimizer and implies the use of a deep learning framework (e.g., PyTorch for the ResNet implementation), but does not specify exact version numbers for any software dependencies.
Experiment Setup | Yes | We train all methods, except Local, for 5000 rounds with partial client participation. For CIFAR10 and CIFAR100 client participation is set to 5% per round... The optimizer used for training at the client is SGD with a batch size of 32, a learning rate chosen via grid search and momentum set to 0.9. The batch size used for computing the descriptor is also 32. [...] the dimension of the embedding vectors is l = n/4 and the number of client SGD steps is k = 50. The regularization parameters for the embedding network and hypernetwork are set to λ_h = λ_v = 10^-3, while the output regularization is λ_θ = 0. (These settings are collected in the training-loop sketch at the end of this section.)
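
For readers who want a concrete picture of the pipeline behind the Pseudocode row, the following is a minimal sketch of the embedding-network/hypernetwork structure that the quoted setup refers to: an embedding network turns a batch of client data into a descriptor, and a hypernetwork maps that descriptor to the weights of a personalized model. All layer sizes, the mean-pooled descriptor, and the single-layer target model are assumptions of this sketch; it is illustrative and does not reproduce the paper's Algorithms 1 and 2.

```python
# Hypothetical sketch (not the authors' code) of an embedding network that
# produces a client descriptor and a hypernetwork that generates personalized
# model parameters from it. All sizes are placeholders.
import torch
import torch.nn as nn


class EmbeddingNet(nn.Module):
    """Maps a batch of client data to a single client descriptor."""

    def __init__(self, in_dim: int, descriptor_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, descriptor_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average per-example embeddings so the descriptor does not depend on
        # the order of the batch (an assumption of this sketch).
        return self.net(x).mean(dim=0)


class HyperNet(nn.Module):
    """Maps a client descriptor to the parameters of a small personal model."""

    def __init__(self, descriptor_dim: int, target_in: int, target_out: int):
        super().__init__()
        self.target_in, self.target_out = target_in, target_out
        n_params = target_in * target_out + target_out  # weight + bias
        self.net = nn.Sequential(nn.Linear(descriptor_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_params))

    def forward(self, v: torch.Tensor):
        flat = self.net(v)
        split = self.target_in * self.target_out
        w = flat[:split].view(self.target_out, self.target_in)
        b = flat[split:]
        return w, b


# Usage: descriptor from one client batch, then a personalized linear predictor.
x = torch.randn(32, 20)                      # one descriptor batch (size 32, as quoted)
embed, hyper = EmbeddingNet(20, 5), HyperNet(5, 20, 10)
w, b = hyper(embed(x))
logits = x @ w.t() + b                       # personalized model applied to client data
```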
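The Dataset Splits row quotes two different validation rules: a fixed budget of 10,000 samples spread across the clients for CIFAR10/CIFAR100, and 10% of each client's data for FEMNIST. Below is a small sketch of how such per-client splits could be produced; the function name, the index-list representation of client shards, and the even spreading of the CIFAR validation budget are assumptions made for illustration, not the authors' preprocessing code.

```python
# Hypothetical sketch of the two quoted validation-split rules. Client shards
# are represented as lists of example indices.
import random
from typing import Dict, List, Optional, Tuple


def split_clients(shards: Dict[str, List[int]],
                  total_val: Optional[int] = None,
                  per_client_frac: Optional[float] = None,
                  seed: int = 0) -> Tuple[Dict[str, List[int]], Dict[str, List[int]]]:
    rng = random.Random(seed)
    train, val = {}, {}
    for cid, idxs in shards.items():
        idxs = idxs[:]
        rng.shuffle(idxs)
        if per_client_frac is not None:        # FEMNIST-style: 10% of each client's data
            n_val = int(len(idxs) * per_client_frac)
        else:                                  # CIFAR-style: fixed total, spread evenly
            n_val = total_val // len(shards)
        val[cid], train[cid] = idxs[:n_val], idxs[n_val:]
    return train, val


# Usage with toy shards: 100 clients with 500 indices each -> 10,000 validation samples.
toy_shards = {f"client_{i}": list(range(i * 500, (i + 1) * 500)) for i in range(100)}
cifar_train, cifar_val = split_clients(toy_shards, total_val=10_000)
femnist_train, femnist_val = split_clients(toy_shards, per_client_frac=0.10)
```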
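Finally, the Experiment Setup row lists the client-side optimization settings. The sketch below gathers the quoted numbers (batch size 32, momentum 0.9, k = 50 local SGD steps, λ_h = λ_v = 10^-3, λ_θ = 0) into one place; the learning rate, model, loss function, and data are placeholders, since the paper selects the learning rate by grid search and this is not the authors' training code.

```python
# Hypothetical sketch of one client's local update using the quoted settings.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

BATCH_SIZE = 32              # client batch size and descriptor batch size (quoted)
LOCAL_STEPS = 50             # k = 50 client SGD steps per round (quoted)
MOMENTUM = 0.9               # quoted
LR = 0.05                    # placeholder: the paper chooses the learning rate by grid search
LAMBDA_H = LAMBDA_V = 1e-3   # quoted; would penalize the hypernetwork / embedding network
LAMBDA_THETA = 0.0           # quoted output regularization (not used in this client-side sketch)


def local_update(model: nn.Module, loader: DataLoader) -> nn.Module:
    """Run k local SGD steps on one client's data (sketch)."""
    opt = torch.optim.SGD(model.parameters(), lr=LR, momentum=MOMENTUM)
    loss_fn = nn.CrossEntropyLoss()
    it = iter(loader)
    for _ in range(LOCAL_STEPS):
        try:
            x, y = next(it)
        except StopIteration:          # restart the loader when a pass over the shard ends
            it = iter(loader)
            x, y = next(it)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model


# Usage with toy data standing in for one client's shard.
data = TensorDataset(torch.randn(256, 20), torch.randint(0, 10, (256,)))
client_loader = DataLoader(data, batch_size=BATCH_SIZE, shuffle=True)
personal_model = nn.Linear(20, 10)
local_update(personal_model, client_loader)
```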