Balancing Similarity and Complementarity for Federated Learning

Authors: Kunda Yan, Sen Cui, Abudukelimu Wuerkaixi, Jingfeng Zhang, Bo Han, Gang Niu, Masashi Sugiyama, Changshui Zhang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive unimodal and multimodal experiments demonstrate that FedSaC markedly surpasses other state-of-the-art FL methods.
Researcher Affiliation | Academia | (1) Institute for Artificial Intelligence, Tsinghua University (THUAI), Beijing National Research Center for Information Science and Technology (BNRist), Department of Automation, Tsinghua University, Beijing, P.R. China; (2) The University of Auckland; (3) RIKEN; (4) Hong Kong Baptist University; (5) The University of Tokyo.
Pseudocode | Yes | Algorithm 1 FedSaC
Open Source Code | Yes | Code is accessible at https://github.com/yankd22/FedSaC/.
Open Datasets | Yes | In our experiments, CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and CUB-200-2011 (Welinder et al., 2010) are all public datasets.
Dataset Splits | No | The paper specifies training and test set sizes for CIFAR-10/100 (50,000 training and 10,000 test images) and mentions using "validation sets" for hyperparameter search, but it does not give the split percentages or counts for a distinct validation set, which would be needed to reproduce the experiment.
Hardware Specification | Yes | Part of the experiments is conducted on a local server running Ubuntu 16.04, with two physical Intel(R) Xeon(R) Gold 6248 CPUs @ 2.50GHz with 20 CPU cores. The other experiments are conducted on a remote server with 8 GeForce RTX 3090 GPUs.
Software Dependencies | No | The paper mentions the operating system (Ubuntu 16.04) and optimizers (SGD, Adam) but does not give version numbers for the software dependencies or machine-learning frameworks used in the experiments.
Experiment Setup | Yes | In the FL training phase, we execute 50 communication rounds. Each round consists of local training iterations that vary by dataset: 200 iterations for CIFAR-10 and 400 for CIFAR-100. Training employs the SGD optimizer with an initial learning rate of 0.01 and a batch size of 64. We use three eigenvectors (k = 3) for our representative subspace. The regularization hyperparameter λ is set to 1. The hyperparameters α and β control the degree of complementarity and similarity in our optimization equation. We consider two scenarios based on client dataset characteristics: for datasets with complementarity, α = 0.9 and β = 1.4 balance similarity and complementarity for enhanced performance; for datasets lacking complementarity, such as the Pathological partition, we reduce complementarity by setting α = 0.5 and β = 1.6.
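The hyperparameters in the experiment-setup row can be collected into a single configuration sketch. This is a minimal illustration of the reported values only; the variable names and helper function are hypothetical and not taken from the official FedSaC code:

```python
# Hyperparameters as reported in the paper's experiment setup.
# Names below are illustrative, not from the authors' repository.
CONFIG = {
    "communication_rounds": 50,
    "local_iterations": {"cifar10": 200, "cifar100": 400},
    "optimizer": "SGD",
    "learning_rate": 0.01,
    "batch_size": 64,
    "k_eigenvectors": 3,   # size of the representative subspace
    "lambda_reg": 1.0,     # regularization hyperparameter lambda
    # (alpha, beta) trade off complementarity vs. similarity:
    "alpha_beta": {
        "with_complementarity": (0.9, 1.4),  # complementary client data
        "pathological": (0.5, 1.6),          # Pathological partition
    },
}

def local_iters(dataset: str) -> int:
    """Number of local training iterations per communication round."""
    return CONFIG["local_iterations"][dataset]
```

Such a table makes it easy to check that a reimplementation matches the reported setup before any training is run.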
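The setup also mentions a "representative subspace" built from three eigenvectors (k = 3). One common way to realize such a construction is to take the top-k singular vectors of a client's feature matrix and compare clients via principal angles between their subspaces. The sketch below shows that generic recipe in NumPy; it is an assumption-laden illustration, not the authors' exact formulation:

```python
import numpy as np

def representative_subspace(features: np.ndarray, k: int = 3) -> np.ndarray:
    """Top-k right singular vectors of an (n_samples, dim) feature matrix.

    After centering, these coincide with the leading eigenvectors of the
    feature covariance; returns a (dim, k) orthonormal basis.
    Generic construction, not necessarily identical to FedSaC's.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions in feature space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def subspace_overlap(u: np.ndarray, v: np.ndarray) -> float:
    """Mean squared cosine of the principal angles between two bases.

    Returns 1.0 for identical subspaces and 0.0 for orthogonal ones.
    """
    s = np.linalg.svd(u.T @ v, compute_uv=False)
    return float(np.mean(s ** 2))
```

A score like `subspace_overlap` could then feed the α/β-weighted trade-off between similarity and complementarity when forming per-client aggregation weights.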