Federated Learning over Connected Modes

Authors: Dennis Grinwald, Philipp Wiesner, Shinichi Nakajima

NeurIPS 2024

Reproducibility assessment (each entry gives the variable, the extracted result, and the supporting LLM response):
Research Type: Experimental
Response: "Our experiments show that FLOCO accelerates the global training process, and significantly improves the local accuracy with minimal computational overhead in cross-silo federated learning settings." "Our experiments show that FLOCO outperforms common FL baselines (FedAvg [1], FedProx [15]) and state-of-the-art personalized FL approaches (FedRoD [16], APFL [17], Ditto [18], FedPer [19]) on both local and global test metrics without introducing significant computational overhead in cross-silo FL settings."
Researcher Affiliation: Academia
Response: Dennis Grinwald¹,², Philipp Wiesner², Shinichi Nakajima¹,²,³ (¹BIFOLD, ²TU Berlin, ³RIKEN Center for AIP); {dennis.grinwald, wiesner, nakajima}@tu-berlin.de
Pseudocode: Yes
Response: Algorithm 1: Federated Learning over Connected Modes (FLOCO). Input: number of communication rounds T, number of clients K, simplex dimension M, subregion assignment round τ, subregion radius ρ.
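To make the algorithm's structure concrete, below is a toy, self-contained sketch of the loop that Algorithm 1 describes, assuming FLOCO maintains M simplex endpoint models that clients combine through mixing weights drawn from per-client subregions of the probability simplex. The subregion sampler, the least-squares local objective, and the uniform subregion centers are illustrative stand-ins, not the authors' implementation (see the repository linked below for that).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_subregion_point(center, rho, rng):
    # Stand-in for FLOCO's subregion sampling: perturb the center by at
    # most rho and re-project onto the probability simplex.
    alpha = center + rng.uniform(-rho, rho, size=center.shape)
    alpha = np.clip(alpha, 1e-6, None)
    return alpha / alpha.sum()

def local_update(endpoints, alpha, X, y, lr=0.05, epochs=5):
    # Toy local training: least squares on the combined model
    # theta = sum_m alpha_m * omega_m; each endpoint receives alpha_m
    # times the gradient w.r.t. theta.
    endpoints = endpoints.copy()
    for _ in range(epochs):
        theta = alpha @ endpoints
        grad_theta = 2 * X.T @ (X @ theta - y) / len(y)
        endpoints -= lr * np.outer(alpha, grad_theta)
    return endpoints

# Outer loop with the inputs named in Algorithm 1: T rounds, K clients,
# simplex dimension M, assignment round tau, subregion radius rho.
T, K, M, tau, rho, d = 50, 10, 3, 25, 0.1, 5
endpoints = rng.normal(scale=0.1, size=(M, d))   # simplex endpoints omega_1..omega_M
centers = np.full((K, M), 1.0 / M)               # per-client subregion centers
data = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(K)]

for t in range(T):
    if t == tau:
        # FLOCO derives per-client centers from projected client updates
        # at round tau; kept uniform here for brevity.
        pass
    sampled = rng.choice(K, size=K // 2, replace=False)
    updates = [local_update(endpoints,
                            sample_subregion_point(centers[k], rho, rng),
                            *data[k])
               for k in sampled]
    endpoints = np.mean(updates, axis=0)         # FedAvg-style aggregation
```

The point this sketch preserves is that every client trains a different convex combination of the shared endpoints, so the endpoints are updated jointly while clients personalize through their mixing weights.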
Open Source Code: Yes
Response: "Our code is publicly available: https://github.com/dennis-grinwald/floco."
Open Datasets: Yes
Response: "To evaluate our method, we perform image classification on the CIFAR-10 [35] and FEMNIST [36] datasets."
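As a starting point for reproduction, here is a minimal sketch of loading CIFAR-10 through torchvision; FEMNIST originates from the LEAF benchmark and is typically downloaded and preprocessed by the FL framework rather than loaded this way. The normalization statistics are the widely used CIFAR-10 values, not values taken from the paper.

```python
from torchvision import datasets, transforms

# CIFAR-10 ships with torchvision; FEMNIST comes from LEAF and is usually
# prepared by the FL framework (e.g., FL-bench) instead.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # common CIFAR-10 stats,
                         (0.2470, 0.2435, 0.2616)),  # not from the paper
])
train_set = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)
```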
Dataset Splits: No
Response: The paper discusses training and testing data but does not explicitly mention a validation set or split in the context of data partitioning.
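Although the paper does not describe a validation split, federated experiments on CIFAR-10 typically partition the training set across clients in a non-IID fashion. The sketch below shows one common scheme, Dirichlet label skew; it is a generic technique, not necessarily the partitioning the paper uses, and the beta value is illustrative.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, beta=0.5, seed=0):
    # Common label-skew scheme: for each class, split its sample indices
    # across clients according to proportions drawn from Dirichlet(beta).
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet([beta] * num_clients)
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            client_indices[k].extend(part.tolist())
    return client_indices
```

A per-client validation set, which this row notes the paper does not mention, could then be carved out of each client's index list before training.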
Hardware Specification: No
Response: Justification: "We did not explicitly compute the resources needed."
Software Dependencies: No
Response: The paper mentions the FL frameworks FL-bench [20] and Flower [21] but does not specify their versions or list other software dependencies with version numbers.
Experiment Setup: Yes
Response: "We provide a table with the training hyperparameters that we use for each dataset/model setting in Appendix B. For the baselines, we follow the recommended parameter settings by the authors, which are detailed in Appendix B."

Table 6: Summary of the hyperparameters used for training.

Dataset/Model          T     K    |St|  e   E/E_Ditto  γ      mom.  wd     µ
CIFAR-10/CifarCNN      500   100  30    50  5          0.02   0.5   10⁻⁵   0.01
CIFAR-10/ResNet-18     100   100  10    32  5          0.01   0.9   10⁻⁴   0.01
FEMNIST/FemnistCNN     350   100  10    32  5          0.1    0.0   0.0    0.01
FEMNIST/SqueezeNetV1   1000  100  10    32  5          0.005  0.0   10⁻⁴   0.01
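For convenience, the same Table 6 values transcribed into a Python mapping. The field names are ours and mirror the column headers (gamma for γ, S_t for |St|, mu for µ); their exact meanings, e.g. whether e denotes the batch size, are defined in the paper's Appendix B, so treat the names here as assumptions.

```python
# Table 6 transcribed for reference; field names are ours and mirror the
# column headers (gamma for γ, S_t for |St|, mu for µ).
HYPERPARAMS = {
    "CIFAR-10/CifarCNN":    dict(T=500,  K=100, S_t=30, e=50, E=5, gamma=0.02,  momentum=0.5, wd=1e-5, mu=0.01),
    "CIFAR-10/ResNet-18":   dict(T=100,  K=100, S_t=10, e=32, E=5, gamma=0.01,  momentum=0.9, wd=1e-4, mu=0.01),
    "FEMNIST/FemnistCNN":   dict(T=350,  K=100, S_t=10, e=32, E=5, gamma=0.1,   momentum=0.0, wd=0.0,  mu=0.01),
    "FEMNIST/SqueezeNetV1": dict(T=1000, K=100, S_t=10, e=32, E=5, gamma=0.005, momentum=0.0, wd=1e-4, mu=0.01),
}
```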