FedGTST: Boosting Global Transferability of Federated Models via Statistics Tuning
Authors: Evelyn Ma, Chao Pan, S. Rasoul Etesami, Han Zhao, Olgica Milenkovic
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, experiments on public benchmarks show that FedGTST significantly outperforms other baselines, such as FedSR. (Section 6: Experiments) |
| Researcher Affiliation | Academia | Evelyn Ma, Chao Pan, Rasoul Etesami, Han Zhao, Olgica Milenkovic University of Illinois Urbana-Champaign {pingm, chaopan2, etesami1, hanzhao, milenkov}@illinois.edu |
| Pseudocode | Yes | Algorithm 1 FedGTST (Round p) |
| Open Source Code | Yes | Justification: The datasets are existing open-source benchmarks, and we have included our code in the Supplementary materials document. |
| Open Datasets | Yes | We investigate three transfer tasks utilizing fully-annotated data: a) MNIST [9] to MNIST-M [11], b) CIFAR-10 [19] to SVHN [33], and c) cross-domain transfer in Domain Net [37]. |
| Dataset Splits | No | The paper describes how the source (local) domains are constructed and used for pretraining, and how the target domain is used for finetuning and evaluation. It mentions an 'early stop trigger of 10 rounds', which implies a validation process, but it does not explicitly specify train/validation/test splits (e.g., percentages or counts) for the source data used in pretraining, nor does it explicitly name a validation set. |
| Hardware Specification | Yes | We used an NVIDIA GeForce RTX 3090 Ti GPU with a memory of 24247 MiB. |
| Software Dependencies | No | The paper mentions using the Adam [17] optimizer, but does not specify version numbers for any software dependencies like programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. |
| Experiment Setup | Yes | 1) Local epochs. To conform with our theoretical assumptions, unless specified otherwise, we set the local epoch of each client to 1. ... Our initial learning rate equals 0.01 and then decays by a factor of 10 per 50 rounds, with an early stop trigger of 10 rounds. We also set the pretraining batch size to 256 for MNIST → MNIST-M and to 128 for CIFAR-10 → SVHN. For both tasks, we use a finetuning learning rate of 0.005, with weight decay 0.0001. (See the sketch after this table.) |
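
To make the reported hyperparameters concrete, below is a minimal sketch in PyTorch of how the pretraining schedule (Adam, initial LR 0.01 decayed by 10x every 50 rounds, early stop after 10 rounds), the finetuning optimizer (LR 0.005, weight decay 0.0001), and the per-task batch sizes could be wired up. This is not the authors' released code: the names `make_pretrain_optimizer`, `make_finetune_optimizer`, and `EarlyStopper` are placeholders, and the metric monitored for early stopping is assumed, since the paper does not name a validation set.

```python
import torch
from torch import nn, optim

# Hyperparameters reported in the paper: local epoch = 1, Adam optimizer,
# initial LR 0.01 decayed by a factor of 10 every 50 rounds, early stop
# trigger of 10 rounds, batch size 256 (MNIST -> MNIST-M) / 128 (CIFAR-10 -> SVHN).
PRETRAIN_BATCH_SIZE = {
    "mnist_to_mnistm": 256,
    "cifar10_to_svhn": 128,
}
FINETUNE_LR = 0.005
FINETUNE_WEIGHT_DECAY = 0.0001


def make_pretrain_optimizer(model: nn.Module):
    """Adam with the reported pretraining schedule: lr=0.01, /10 every 50 rounds."""
    optimizer = optim.Adam(model.parameters(), lr=0.01)
    # Step the scheduler once per communication round.
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)
    return optimizer, scheduler


def make_finetune_optimizer(model: nn.Module):
    """Adam with the reported finetuning settings: lr=0.005, weight_decay=1e-4."""
    return optim.Adam(model.parameters(), lr=FINETUNE_LR,
                      weight_decay=FINETUNE_WEIGHT_DECAY)


class EarlyStopper:
    """Stop when the monitored metric (assumed: validation loss) fails to
    improve for `patience` consecutive rounds; patience=10 matches the
    paper's 'early stop trigger of 10 rounds'."""

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best = float("inf")
        self.bad_rounds = 0

    def step(self, metric: float) -> bool:
        if metric < self.best:
            self.best = metric
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience
```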