FedGTST: Boosting Global Transferability of Federated Models via Statistics Tuning

Authors: Evelyn Ma, Chao Pan, S. Rasoul Etesami, Han Zhao, Olgica Milenkovic

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, experiments on public benchmarks show that FedGTST significantly outperforms other baselines, such as FedSR." (Section 6, Experiments)
Researcher Affiliation | Academia | Evelyn Ma, Chao Pan, Rasoul Etesami, Han Zhao, Olgica Milenkovic, University of Illinois Urbana-Champaign, {pingm, chaopan2, etesami1, hanzhao, milenkov}@illinois.edu
Pseudocode | Yes | "Algorithm 1: FedGTST (Round p)" (a generic, hedged sketch of one federated round is given after this table)
Open Source Code | Yes | "Justification: The datasets are existing open-source benchmarks, and we have included our code in the supplementary materials document."
Open Datasets | Yes | "We investigate three transfer tasks utilizing fully-annotated data: a) MNIST [9] to MNIST-M [11], b) CIFAR-10 [19] to SVHN [33], and c) cross-domain transfer in DomainNet [37]."
Dataset Splits | No | The paper describes how the source (local) domains are constructed and used for pretraining and how the target domain is used for finetuning and evaluation. The mention of an "early stop trigger of 10 rounds" implies some validation process, but the paper neither specifies exact training/validation/test splits (e.g., percentages or counts) for the source data used in pretraining nor explicitly names a validation set.
Hardware Specification | Yes | "We used an NVIDIA GeForce RTX 3090 Ti GPU with 24247 MiB of memory."
Software Dependencies | No | The paper mentions using the Adam [17] optimizer but does not give version numbers for any software dependencies, such as the programming language (e.g., Python), the deep learning framework (e.g., PyTorch, TensorFlow), or other libraries.
Experiment Setup | Yes | "1) Local epochs. To conform with our theoretical assumptions, unless specified otherwise, we set the local epoch of each client to 1. ... Our initial learning rate equals 0.01 and then decays by a factor of 10 per 50 rounds, with an early stop trigger of 10 rounds. We also set the pretraining batch size to 256 for MNIST → MNIST-M and to 128 for CIFAR-10 → SVHN. For both tasks, we use a finetuning learning rate of 0.005, with weight decay 0.0001." (See the configuration sketch after this table.)
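The paper's Algorithm 1 (FedGTST, Round p) is not reproduced in this summary. For orientation only, the sketch below shows a generic FedAvg-style communication round with one local epoch per client, matching the Experiment Setup row above; the cross-client statistics-tuning step that distinguishes FedGTST is deliberately omitted, and every function and variable name here is illustrative rather than taken from the paper's code.

```python
# Illustrative FedAvg-style communication round (NOT the paper's Algorithm 1;
# FedGTST's statistics-tuning term is omitted because it is not described here).
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def local_update(global_model: nn.Module, loader: DataLoader, lr: float = 0.01) -> dict:
    """Run one local epoch on a client (the paper sets the local epoch to 1)."""
    model = copy.deepcopy(global_model)
    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam, as reported
    criterion = nn.CrossEntropyLoss()
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    return model.state_dict()


def federated_round(global_model: nn.Module, client_loaders: list) -> nn.Module:
    """Aggregate client updates by unweighted parameter averaging."""
    client_states = [local_update(global_model, loader) for loader in client_loaders]
    avg_state = {}
    for key in client_states[0]:
        stacked = torch.stack([state[key].float() for state in client_states])
        # Average in float, then cast back to the original dtype (e.g., int buffers).
        avg_state[key] = stacked.mean(dim=0).to(client_states[0][key].dtype)
    global_model.load_state_dict(avg_state)
    return global_model
```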
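The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. Because the paper does not name its framework or versions (see the Software Dependencies row), the PyTorch optimizer and scheduler calls below are assumptions about how the reported Adam optimizer and the "decay by a factor of 10 per 50 rounds" schedule might be wired up; the placeholder model is for illustration only.

```python
# Hedged sketch of the reported pretraining/finetuning hyperparameters.
# Framework choice (PyTorch) and scheduler wiring are assumptions, not from the paper.
import torch
import torch.nn as nn

PRETRAIN_CONFIG = {
    "local_epochs": 1,           # one local epoch per client per round
    "initial_lr": 0.01,          # decays by a factor of 10 every 50 rounds
    "lr_decay_every": 50,
    "lr_decay_factor": 0.1,
    "early_stop_rounds": 10,     # early-stop trigger reported in the paper
    "batch_size": {"MNIST->MNIST-M": 256, "CIFAR-10->SVHN": 128},
}
FINETUNE_CONFIG = {"lr": 0.005, "weight_decay": 1e-4}

model = nn.Linear(32, 10)  # placeholder model for illustration only
optimizer = torch.optim.Adam(model.parameters(), lr=PRETRAIN_CONFIG["initial_lr"])
# Stepping this scheduler once per communication round reproduces the
# "decay by 10x per 50 rounds" schedule described in the paper.
scheduler = torch.optim.lr_scheduler.StepLR(
    optimizer,
    step_size=PRETRAIN_CONFIG["lr_decay_every"],
    gamma=PRETRAIN_CONFIG["lr_decay_factor"],
)
```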