Global Convergence of Federated Learning for Mixed Regression
Authors: Lili Su, Jiaming Xu, Pengkun Yang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper studies the problem of model training under Federated Learning when clients exhibit cluster structure. We contextualize this problem in mixed regression, where each client has limited local data generated from one of k unknown regression models. We design an algorithm that achieves global convergence from any initialization, and works even when local data volume is highly unbalanced: there could exist clients that contain only O(1) data points. Our algorithm first runs moment descent on a few anchor clients (each with Θ(k) data points) to obtain coarse model estimates. Then each client alternately estimates its cluster labels and refines the model estimates based on FedAvg or FedProx. A key innovation in our analysis is a uniform estimate on the clustering errors, which we prove by bounding the VC dimension of general polynomial concept classes based on the theory of algebraic geometry. |
| Researcher Affiliation | Academia | Lili Su, Electrical and Computer Engineering, Northeastern University (l.su@northeastern.edu); Jiaming Xu, The Fuqua School of Business, Duke University (jiaming.xu868@duke.edu); Pengkun Yang, Center for Statistical Science, Tsinghua University (yangpengkun@tsinghua.cn) |
| Pseudocode | Yes | The paper presents pseudocode for both phases of the algorithm: Phase 1, Federated Moment Descent (FedMD), and Phase 2, FedX + clustering (where FedX stands for FedAvg or FedProx). |
| Open Source Code | No | The paper is theoretical and does not provide an explicit statement about releasing source code for the methodology described. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding code availability. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets. Therefore, it does not provide concrete access information for a publicly available or open dataset. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding data. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments, hence it does not specify dataset splits for training, validation, or testing. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding training details like data splits. |
| Hardware Specification | No | The paper is theoretical and does not conduct experiments, thus no hardware specifications for running experiments are provided. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding compute and resources used. |
| Software Dependencies | No | The paper is theoretical and does not conduct experiments, thus no specific software dependencies with version numbers for replication are provided. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding compute and resources used. |
| Experiment Setup | No | The paper is theoretical and does not conduct experiments, hence no specific experimental setup details like hyperparameters or training configurations are provided. In the 'Questions for Paper Analysis' section, it states '[N/A]' for questions regarding training details. |
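Since the paper itself reports no experiments, the alternating loop described in its abstract (each client estimates its cluster label, then the server refines the per-cluster models FedAvg-style) can be illustrated on a toy mixed linear regression instance. Everything below is a hypothetical sketch: the dimensions, learning rate, synthetic data, and the use of a single averaged gradient step per round (in place of the paper's full Phase 1 moment-descent initialization and FedAvg/FedProx local updates) are this sketch's assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_clients, n_local = 5, 2, 60, 8  # toy sizes (assumed, not from the paper)

# Two well-separated ground-truth regression models (k = 2), one per cluster.
theta_true = np.vstack([np.ones(d), -np.ones(d)])
labels_true = rng.integers(k, size=n_clients)

# Each client holds n_local samples from its cluster's linear model plus noise.
X = rng.normal(size=(n_clients, n_local, d))
y = np.einsum('cnd,cd->cn', X, theta_true[labels_true]) \
    + 0.01 * rng.normal(size=(n_clients, n_local))

# Coarse initialization: a stand-in for Phase 1 (federated moment descent).
theta = theta_true + 0.5 * rng.normal(size=(k, d))

lr = 0.1
for _ in range(50):
    # Phase 2, step 1: each client picks the model with smallest local loss.
    losses = np.array([[np.mean((X[c] @ theta[j] - y[c]) ** 2)
                        for j in range(k)] for c in range(n_clients)])
    labels = losses.argmin(axis=1)

    # Phase 2, step 2: per-cluster FedAvg-style averaging of local gradients.
    new_theta = theta.copy()
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        grads = [2 * X[c].T @ (X[c] @ theta[j] - y[c]) / n_local
                 for c in members]
        new_theta[j] = theta[j] - lr * np.mean(grads, axis=0)
    theta = new_theta
```

Under this setup the per-cluster models contract toward the ground truth geometrically, and once the coarse estimates are closer to the correct model than to the wrong one, every client's label estimate is stable across rounds, mirroring the uniform clustering-error control the paper's analysis provides.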