Federated Optimization with Doubly Regularized Drift Correction
Authors: Xiaowen Jiang, Anton Rodomanov, Sebastian U Stich
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we illustrate the main theoretical properties of our studied methods in numerical experiments on both simulated and real datasets. Figure 1: Illustrating communication reduction for DANE+-GD and FedRed-GD on synthetic dataset using quadratic loss with L/δ_A = L/δ_B = 20. |
| Researcher Affiliation | Academia | 1CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; 2Universität des Saarlandes, Saarbrücken, Germany. |
| Pseudocode | Yes | Algorithm 1 DANE+; Algorithm 2 FedRed: Federated optimization framework with doubly regularized drift correction; Algorithm 3 FedRed-(S)GD |
| Open Source Code | No | The paper does not contain any explicit statements about open-sourcing code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Binary classification on LIBSVM datasets. We experiment with the binary classification task on four real-world LIBSVM datasets (Chang and Lin, 2011). |
| Dataset Splits | No | The paper states 'We use n = 5 and split the dataset according to the Dirichlet distribution.' This describes the splitting methodology but does not specify exact percentages, counts, or cite predefined train/validation/test splits. A sketch of such a Dirichlet-based client split is given after the table. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We set β = 0 for convex problems and β = 400 for the non-convex case. We further use n = 5, m = 10, and d = 1000. We use the constant probability (0.05) schedule for FedRed-GD. Lastly, we set the same step size for all three methods. We perform a grid search to find the best hyper-parameters for each algorithm, including the number of local steps and the step sizes. A hedged sketch of a comparable synthetic setup is given after the table. |
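
The Dataset Splits row reports that the data are partitioned across n = 5 clients via a Dirichlet distribution, but the paper does not spell out the procedure. The snippet below is a minimal sketch of one common way such a split is implemented; the concentration parameter `alpha`, the per-class sampling scheme, and the helper name `dirichlet_split` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def dirichlet_split(labels, n_clients=5, alpha=0.5, seed=0):
    """Partition sample indices across clients using per-class Dirichlet proportions.

    For every class, a proportion vector is drawn from Dirichlet(alpha) and the
    class's samples are distributed to the n_clients accordingly. Smaller alpha
    yields a more heterogeneous (non-IID) split.
    """
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        cls_idx = np.where(labels == cls)[0]
        rng.shuffle(cls_idx)
        # Draw client proportions for this class and convert to split points.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        split_points = (np.cumsum(proportions) * len(cls_idx)).astype(int)[:-1]
        for client_id, chunk in enumerate(np.split(cls_idx, split_points)):
            client_indices[client_id].extend(chunk.tolist())
    return [np.array(idx) for idx in client_indices]

# Example: split binary labels of a LIBSVM-style dataset across 5 clients.
labels = np.random.default_rng(1).integers(0, 2, size=1000)
parts = dirichlet_split(labels, n_clients=5, alpha=0.5)
print([len(p) for p in parts])
```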
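The Experiment Setup row mentions a synthetic quadratic problem with n = 5 clients, d = 1000, and a constant communication probability of 0.05 for FedRed-GD. The sketch below only illustrates how such a setup could be wired together: the quadratic objectives, eigenvalue range, step size, and smaller dimension are assumptions, and the drift correction shown is a generic SCAFFOLD-style control-variate update, not the paper's doubly regularized FedRed step (Algorithm 2).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 100             # clients and dimension (the paper uses n = 5, d = 1000)
p_comm = 0.05             # constant communication probability from the setup row

# Assumed synthetic quadratics f_i(x) = 0.5 * x^T A_i x - b_i^T x with diagonal A_i.
A = [np.diag(rng.uniform(0.5, 1.0, size=d)) for _ in range(n)]
b = [rng.normal(size=d) for _ in range(n)]
grad = lambda i, x: A[i] @ x - b[i]

gamma = 0.5               # step size; the paper tunes step sizes by grid search
x_server = np.zeros(d)
x_local = [x_server.copy() for _ in range(n)]
# Generic SCAFFOLD-style control variates standing in for drift correction;
# the paper's FedRed instead uses a doubly regularized correction (Algorithm 2).
c_local = [np.zeros(d) for _ in range(n)]
c_global = np.zeros(d)

rounds = 0
for step in range(2000):
    for i in range(n):
        # Drift-corrected local gradient step.
        x_local[i] = x_local[i] - gamma * (grad(i, x_local[i]) - c_local[i] + c_global)
    if rng.random() < p_comm:
        # Communication round: average iterates and refresh correction terms.
        x_server = np.mean(x_local, axis=0)
        c_local = [grad(i, x_server) for i in range(n)]
        c_global = np.mean(c_local, axis=0)
        x_local = [x_server.copy() for _ in range(n)]
        rounds += 1

avg_grad = np.mean([grad(i, x_server) for i in range(n)], axis=0)
print(f"communication rounds: {rounds}, final grad norm: {np.linalg.norm(avg_grad):.3e}")
```

With a constant probability schedule, only about one in twenty local steps triggers a communication round, which is the mechanism behind the communication reduction the methods are designed for.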