Pursuing Overall Welfare in Federated Learning through Sequential Decision Making
Authors: Seok-Ju Hahn, Gi-Soo Kim, Junghye Lee
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments demonstrate that the federated system equipped with AAggFF achieves better degree of client-level fairness than existing methods in both practical settings." and "We evaluate AAggFF on extensive benchmark datasets for realistic FL scenarios with other baselines." |
| Researcher Affiliation | Academia | 1Department of Industrial Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea 2Artificial Intelligence Graduate School, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea 3Technology Management, Economics and Policy Program, Seoul National University (SNU), Seoul, South Korea 4Graduate School of Engineering Practice, Seoul National University (SNU), Seoul, South Korea 5Institute of Engineering Research, Seoul National University (SNU), Seoul, South Korea. |
| Pseudocode | Yes | Appendix D, "Pseudocode for AAggFF": D.1. Client Update (Algorithm 1); D.2. AAggFF-S (Algorithm 2); D.3. AAggFF-D (Algorithm 3). |
| Open Source Code | Yes | Code is available at https://github.com/vaseline555/AAggFF. |
| Open Datasets | Yes | For the cross-silo setting, we used Berka tabular dataset (Berka, 1999), MQP clinical text dataset (McCreery et al., 2020), and ISIC oncological image dataset (Codella et al., 2018) (also a part of FLamby benchmark (Ogier du Terrail et al., 2022)). For the cross-device setting, we used CelebA vision dataset (Liu et al., 2015), Reddit text dataset (both are parts of LEAF benchmark (Caldas et al., 2019)) and Speech Commands audio dataset (Warden, 2018). |
| Dataset Splits | Yes | Each client dataset is split into an 80% training set and a 20% test set in a stratified manner where applicable. (A minimal split sketch follows the table.) |
| Hardware Specification | Yes | All experiments are conducted on a server with 2 Intel Xeon Gold 6226R CPUs (@ 2.90GHz) and 2 NVIDIA Tesla V100-PCIE-32GB GPUs. |
| Software Dependencies | No | All code is implemented in PyTorch (Paszke et al., 2019)... The paper only names the software with a citation to its original paper; no specific version number is given. |
| Experiment Setup | Yes | All experiments are run with 3 different random seeds after tuning hyperparameters. For each dataset, a weight decay (L2 penalty) factor (ψ), a local learning rate (ζ), and variables related to learning rate scheduling (i.e., a learning rate decay factor (ϕ) and a decay step (s)) are tuned first with FedAvg (McMahan et al., 2017) as follows. Berka: ψ = 10⁻³, ζ = 10⁰, ϕ = 0.99, s = 10 ... (and more specific values; a hedged sketch of this setup follows the table). |
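
The per-client 80/20 stratified split quoted in the Dataset Splits row maps onto a standard utility. The sketch below is only an illustration of that description, not code from the AAggFF repository; the `split_client_dataset` helper, the `features`/`labels` inputs, and the `seed` value are assumptions.

```python
# Minimal sketch of the quoted per-client 80/20 stratified split
# (illustrative only; not taken from the AAggFF repository).
from sklearn.model_selection import train_test_split

def split_client_dataset(features, labels, seed=42):
    """Split one client's data into 80% train / 20% test, stratified by label."""
    return train_test_split(
        features, labels,
        test_size=0.2,        # 20% held out as the client-level test set
        stratify=labels,      # preserve the label distribution in both splits
        random_state=seed,    # fixed seed so the split is reproducible
    )
```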
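Reading the quoted Experiment Setup row, ψ is an L2 penalty, ζ a local learning rate, and (ϕ, s) a step-wise learning-rate decay. The sketch below shows one plausible way to wire those quantities into a seeded PyTorch client update; the placeholder model, the helper names, and the default values are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch: seeding a run and mapping the quoted hyperparameters
# (psi = weight decay, zeta = local LR, phi = LR decay factor, s = decay step)
# onto a PyTorch optimizer/scheduler. Model and values are illustrative only.
import random
import numpy as np
import torch

def seed_everything(seed: int) -> None:
    # Fix all common sources of randomness for one of the repeated runs.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

def make_local_optimizer(model, psi=1e-3, zeta=1.0, phi=0.99, s=10):
    # psi -> weight_decay, zeta -> lr, (phi, s) -> step-wise LR decay.
    optimizer = torch.optim.SGD(model.parameters(), lr=zeta, weight_decay=psi)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=s, gamma=phi)
    return optimizer, scheduler

# e.g., one of the "3 different random seeds" mentioned in the setup
seed_everything(0)
model = torch.nn.Linear(in_features=16, out_features=2)  # placeholder model
optimizer, scheduler = make_local_optimizer(model)
```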