FedAvP: Augment Local Data via Shared Policy in Federated Learning
Authors: Minui Hong, Junhyeog Yun, Insu Jeon, Gunhee Kim
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, our FedAvP demonstrates superior performance on CIFAR-10/100 [21], SVHN [22], and FEMNIST [23] datasets within an FL context, compared to existing federated learning algorithms, including FedAvg [2], FedProx [4], FedDyn [5], FedExP [6], and federated data augmentation algorithms, including FedGen [9], FedMix [8], and FedFA [11]. |
| Researcher Affiliation | Academia | Minui Hong, Junhyeog Yun, Insu Jeon, Gunhee Kim; Seoul National University, Seoul, South Korea; {alsdml123,junhyeog,gunhee}@snu.ac.kr, {insuj3on}@gmail.com |
| Pseudocode | Yes | Algorithm 1 FedAvP: Joint Training |
| Open Source Code | Yes | Our code is available at https://github.com/alsdml/FedAvP. |
| Open Datasets | Yes | In the experiments, our FedAvP demonstrates superior performance on CIFAR-10/100 [21], SVHN [22], and FEMNIST [23] datasets within an FL context... |
| Dataset Splits | Yes | We assign the data to 130 clients based on a Dirichlet distribution with different hyperparameters of α = [5.0, 0.1], as done in pFL-Bench [25]. The smaller α is, the higher the degree of heterogeneity is. Among these clients, only 100 randomly selected clients participate in the training, while the remaining 30 are nominated as out-of-distribution (OOD) clients. (A sketch of such a Dirichlet partition follows the table.) |
| Hardware Specification | Yes | All experiments are run on a cluster of 32 NVIDIA GTX 1080 GPUs. |
| Software Dependencies | No | We utilized both the Tree-structured Parzen Estimator algorithm and Random Sampler as hyperparameter samplers within Optuna. We leveraged a PyTorch implementation [47]. We used the Adam optimizer [49]. |
| Experiment Setup | Yes | For the FedAvP algorithm, the hyperparameters include the server policy learning rate η, the client policy learning rate λ, the gradient clipping threshold c, and a regularization term ϵ. In our experimentation, we tuned η within [0.4, 0.9], λ within [0.1, 0.9], c within [0.4, 1.0], and ϵ within [0.0, 0.5]. The validation batch size was also explored within [64, 128, 192]. Common hyperparameters across all methods were a local epoch count of 5 and a local batch size of 64. The client model learning rate γ was searched within the range [0.1, 0.3]. (An Optuna search-space sketch follows the table.) |
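
The pseudocode row above points to Algorithm 1 (FedAvP: Joint Training), which is not quoted in full in this report. The sketch below does not reconstruct that algorithm; it only outlines, as an assumption, a generic federated round in which clients update both a shared model and a shared augmentation policy and the server averages each FedAvg-style. The `local_update` interface and the flat parameter-vector representation are hypothetical.

```python
import numpy as np

def joint_training_round(global_model, global_policy, clients, gamma, lam):
    """Generic outline of one federated round with a shared augmentation policy.

    Not the paper's Algorithm 1: it assumes each client returns updated copies
    of both the model parameters and the policy parameters, and the server
    simply averages them (FedAvg-style).

    global_model, global_policy: 1-D NumPy parameter vectors (stand-ins).
    clients: objects exposing a hypothetical local_update(...) method.
    """
    model_updates, policy_updates = [], []
    for client in clients:
        # Each client starts from copies of the current global parameters and
        # trains locally on policy-augmented data (learning rates gamma, lam).
        local_model, local_policy = client.local_update(
            global_model.copy(), global_policy.copy(),
            model_lr=gamma, policy_lr=lam,
        )
        model_updates.append(local_model)
        policy_updates.append(local_policy)
    # Uniform averaging of both the model and the shared policy.
    return np.mean(model_updates, axis=0), np.mean(policy_updates, axis=0)
```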
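The dataset-splits row describes a label-wise Dirichlet partition over 130 clients with concentration α ∈ {5.0, 0.1}, of which 100 participate in training and 30 are held out as OOD clients. The snippet below is a minimal sketch of such a partition, assuming a standard class-wise Dirichlet allocation; the function name `dirichlet_partition` and the NumPy-only implementation are illustrative, not the authors' released code.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=130, alpha=0.1, seed=0):
    """Split example indices across clients with a class-wise Dirichlet prior.

    Smaller alpha -> more heterogeneous (non-IID) client label distributions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]

    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        # Fraction of this class assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert fractions into split points over the shuffled class indices.
        split_points = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client_id, chunk in enumerate(np.split(cls_idx, split_points)):
            client_indices[client_id].extend(chunk.tolist())

    return [np.array(idx) for idx in client_indices]

# Example: 130 clients in total; 100 participate in training, 30 serve as OOD clients.
labels = np.random.randint(0, 10, size=50_000)   # placeholder CIFAR-10-sized label array
client_splits = dirichlet_partition(labels, num_clients=130, alpha=0.1)
perm = np.random.default_rng(0).permutation(130)
train_clients, ood_clients = perm[:100], perm[100:]
```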
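The software-dependencies and experiment-setup rows mention Optuna's Tree-structured Parzen Estimator (TPE) and Random samplers together with the quoted search ranges for η, λ, c, ϵ, γ and the validation batch size. The sketch below shows how such a study could be wired up; `run_fedavp` is a hypothetical stand-in for the actual training entry point, and here it only returns a dummy score.

```python
import random
import optuna

def run_fedavp(**hparams):
    """Hypothetical stand-in for a full FedAvP training run.

    In practice this would train with the given hyperparameters and return
    a validation accuracy; here it just returns a random placeholder score.
    """
    return random.random()

def objective(trial):
    # Search ranges quoted in the experiment-setup row.
    eta = trial.suggest_float("server_policy_lr", 0.4, 0.9)        # η
    lam = trial.suggest_float("client_policy_lr", 0.1, 0.9)        # λ
    clip = trial.suggest_float("grad_clip_threshold", 0.4, 1.0)    # c
    eps = trial.suggest_float("regularization", 0.0, 0.5)          # ϵ
    gamma = trial.suggest_float("client_model_lr", 0.1, 0.3)       # γ
    val_batch = trial.suggest_categorical("val_batch_size", [64, 128, 192])

    return run_fedavp(eta=eta, lam=lam, clip=clip, eps=eps, gamma=gamma,
                      val_batch=val_batch, local_epochs=5, local_batch=64)

# TPE or random sampling, as mentioned in the software-dependencies row.
sampler = optuna.samplers.TPESampler(seed=0)   # or optuna.samplers.RandomSampler(seed=0)
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(objective, n_trials=50)
```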