FedAvP: Augment Local Data via Shared Policy in Federated Learning

Authors: Minui Hong, Junhyeog Yun, Insu Jeon, Gunhee Kim

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the experiments, our FedAvP demonstrates superior performance on CIFAR-10/100 [21], SVHN [22], and FEMNIST [23] datasets within an FL context, compared to existing federated learning algorithms, including FedAvg [2], FedProx [4], FedDyn [5], FedExP [6], and federated data augmentation algorithms, including FedGen [9], FedMix [8], and FedFA [11].
Researcher Affiliation | Academia | Minui Hong, Junhyeog Yun, Insu Jeon, Gunhee Kim; Seoul National University, Seoul, South Korea; {alsdml123,junhyeog,gunhee}@snu.ac.kr, {insuj3on}@gmail.com
Pseudocode | Yes | Algorithm 1 FedAvP: Joint Training
Open Source Code | Yes | Our code is available at https://github.com/alsdml/FedAvP.
Open Datasets | Yes | In the experiments, our FedAvP demonstrates superior performance on CIFAR-10/100 [21], SVHN [22], and FEMNIST [23] datasets within an FL context...
Dataset Splits | Yes | We assign the data to 130 clients based on a Dirichlet distribution with different hyperparameters of α = [5.0, 0.1], as done in pFL-Bench [25]. The smaller α is, the higher the degree of heterogeneity. Among these clients, only 100 randomly selected clients participate in training, while the remaining 30 are nominated as out-of-distribution (OOD) clients. (A partitioning sketch appears after the table.)
Hardware Specification | Yes | All experiments are run on a cluster of 32 NVIDIA GTX 1080 GPUs.
Software Dependencies | No | We utilized both the Tree-structured Parzen Estimator algorithm and the Random Sampler as hyperparameter samplers within Optuna. We leveraged a PyTorch implementation [47]. We used the Adam optimizer [49].
Experiment Setup | Yes | For the FedAvP algorithm, the hyperparameters include the server policy learning rate η, client policy learning rate λ, gradient clipping threshold c, and a regularization term ϵ. In our experiments, we tuned η within [0.4, 0.9], λ within [0.1, 0.9], c within [0.4, 1.0], and ϵ within [0.0, 0.5]. The validation batch size was also explored within [64, 128, 192]. Common hyperparameters across all methods were the number of local epochs, set to 5, and the local batch size, set to 64. The client model learning rate γ was searched within the range [0.1, 0.3]. (An Optuna search-space sketch appears after the table.)
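
For context on the Dataset Splits row, the following is a minimal sketch of Dirichlet label-skew partitioning with 130 clients (100 training clients, 30 held out as OOD clients), where smaller α yields higher heterogeneity. This is an illustration only: the function name `partition_dirichlet` and the exact splitting details are assumptions, not taken from the FedAvP or pFL-Bench code.

```python
import numpy as np

def partition_dirichlet(labels, num_clients=130, alpha=0.1, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    labels: 1-D array of integer class labels for the whole training set.
    alpha:  Dirichlet concentration; smaller alpha -> more heterogeneity.
    Returns a list of index arrays, one per client.
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]

    for c in range(num_classes):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # Sample per-client proportions of class c from Dirichlet(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert proportions to split points over the class-c indices.
        cuts = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, shard in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(shard.tolist())

    return [np.array(idx) for idx in client_indices]

# Example: 130 clients, of which 100 train and 30 are nominated as OOD clients.
if __name__ == "__main__":
    labels = np.random.randint(0, 10, size=50_000)   # stand-in for CIFAR-10 labels
    clients = partition_dirichlet(labels, num_clients=130, alpha=0.1)
    order = np.random.default_rng(0).permutation(130)
    train_clients, ood_clients = order[:100], order[100:]
```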
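For the Experiment Setup and Software Dependencies rows, here is a hedged sketch of how the reported Optuna search over η, λ, c, ϵ, γ, and the validation batch size could be wired up with the TPE or Random samplers the authors mention. Treating the intervals as continuous, the trial count, and the `run_fedavp_trial` stub are assumptions for illustration, not the authors' implementation.

```python
import optuna

def run_fedavp_trial(**params) -> float:
    """Hypothetical stand-in for one full FedAvP training run.
    Replace with the real training/evaluation loop; returns a dummy score here."""
    return 0.0

def objective(trial: optuna.Trial) -> float:
    # Search intervals follow the reported ranges; treating them as continuous
    # (and the validation batch size as categorical) is an assumption.
    params = {
        "server_policy_lr": trial.suggest_float("eta", 0.4, 0.9),       # η
        "client_policy_lr": trial.suggest_float("lambda", 0.1, 0.9),    # λ
        "grad_clip":        trial.suggest_float("c", 0.4, 1.0),         # c
        "regularization":   trial.suggest_float("epsilon", 0.0, 0.5),   # ϵ
        "client_model_lr":  trial.suggest_float("gamma", 0.1, 0.3),     # γ
        "val_batch_size":   trial.suggest_categorical("val_batch", [64, 128, 192]),
        "local_epochs": 5,       # fixed across all methods
        "local_batch_size": 64,  # fixed across all methods
    }
    return run_fedavp_trial(**params)

# The review notes both TPE and random sampling were used; either sampler works here.
study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
# sampler=optuna.samplers.RandomSampler(seed=0) is the random-search alternative.
study.optimize(objective, n_trials=50)
```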