FeDXL: Provable Federated Learning for Deep X-Risk Optimization
Authors: Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct empirical studies of FeDXL for deep AUROC and partial AUROC maximization, and demonstrate their performance compared with several baselines. |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science and Engineering, Texas A&M University; ²Alibaba; ³Department of Computer Science, University of Rochester. |
| Pseudocode | Yes | Algorithm 1 FeDXL1: FL for DXO with linear f, and Algorithm 2 FeDXL2: Federated Learning for DXO with non-linear f. |
| Open Source Code | Yes | Code is released at https://github.com/Optimization-AI/ICML2023_FeDXL. |
| Open Datasets | Yes | We use four datasets: Cifar10, Cifar100 (Krizhevsky et al., 2009), CheXpert (Irvin et al., 2019), and ChestMNIST (Yang et al., 2021a)... |
| Dataset Splits | Yes | For Cifar10 and Cifar100, we sample 20% of the training data as validation set... For CheXpert, we consider the task of predicting Consolidation and use the last 1000 images in the training set as the validation set and use the original validation set as the testing set. For ChestMNIST, we consider the task of Mass prediction and use the provided train/valid/test split. (See the split sketch below the table.) |
| Hardware Specification | Yes | Each algorithm was run on 16 client machines connected by InfiniBand, where each machine uses an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions "We use the PyTorch framework (Paszke et al., 2019)," but it does not provide specific version numbers for PyTorch or any other software dependencies, which are needed for a reproducible description. (See the version-logging snippet below the table.) |
| Experiment Setup | Yes | We tune the initial step size in [1e-3, 1] using grid search and decay it by a factor of 0.1 every 5K iterations. All algorithms are run for 20K iterations. The mini-batch sizes B1, B2 (as in Step 11 of FeDXL1 and FeDXL2) are set to 32. The β parameter of FeDXL2 (and corresponding Local Pair and Centralized method) is set to 0.1. ... For all the non-centralized algorithms, we set the communication interval K = 32 unless specified otherwise. (See the schedule sketch below the table.) |
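
For the Dataset Splits row, a minimal sketch of the reported splits follows. The fixed seed and the `load_chexpert_records()` helper are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the reported validation splits; the seed and the
# load_chexpert_records() helper are assumptions for illustration.
import torch
from torch.utils.data import random_split
from torchvision.datasets import CIFAR10

# Cifar10/Cifar100: sample 20% of the training data as the validation set.
full_train = CIFAR10(root="./data", train=True, download=True)
n_val = int(0.2 * len(full_train))  # 10,000 of the 50,000 training images
train_set, val_set = random_split(
    full_train,
    [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # assumed seed, not from the paper
)

# CheXpert (Consolidation task): the last 1000 training images become the
# validation set; the original validation set serves as the test set.
records = load_chexpert_records()  # hypothetical loader for the CheXpert index
train_records, val_records = records[:-1000], records[-1000:]
```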
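Since the Software Dependencies row notes that no version numbers are given, a snippet like the following could record them when re-running the released code; this is a generic suggestion, not something from the paper.

```python
# Log the library versions the paper omits; generic reproducibility aid.
import sys
import torch

print("python :", sys.version.split()[0])
print("pytorch:", torch.__version__)
print("cuda   :", torch.version.cuda)              # None for CPU-only builds
print("cudnn  :", torch.backends.cudnn.version())  # None if cuDNN is unavailable
```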
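Finally, the Experiment Setup row describes an initial step size decayed by a factor of 0.1 every 5K iterations over a 20K-iteration run. A minimal sketch of that schedule and the reported constants follows; the four grid points are an assumption, since the paper only states the search range [1e-3, 1].

```python
# Sketch of the reported training constants and step-size schedule.
# The grid points are assumed; the paper only gives the range [1e-3, 1].
LR_GRID = [1e-3, 1e-2, 1e-1, 1.0]  # assumed grid over the reported search range
TOTAL_ITERS = 20_000               # all algorithms run for 20K iterations
DECAY_EVERY = 5_000                # decay the step size every 5K iterations
B1 = B2 = 32                       # mini-batch sizes (Step 11 of FeDXL1/FeDXL2)
BETA = 0.1                         # beta parameter of FeDXL2
K = 32                             # communication interval for non-centralized runs

def step_size(lr0: float, t: int) -> float:
    """Initial step size lr0, decayed by a factor of 0.1 every DECAY_EVERY iterations."""
    return lr0 * (0.1 ** (t // DECAY_EVERY))

# With lr0 = 0.1: iterations 0-4999 use 0.1, 5000-9999 use 0.01, and so on.
```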