SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
Authors: Paul Mangold, Sergey Samsonov, Safwan Labbi, Ilya Levin, Reda Alami, Alexey Naumov, Eric Moulines
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (Fed LSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of Fed LSA scales polynomially with the inverse of the desired accuracy ϵ. To overcome this, we propose SCAFFLSA, a new variant of Fed LSA that uses control variates to correct for client drift, and establish its sample and communication complexities. We show that for statistically heterogeneous agents, its communication complexity scales logarithmically with the desired accuracy, similar to Scaffnew [37]. An important finding is that, compared to the existing results for Scaffnew, the sample complexity scales with the inverse of the number of agents, a property referred to as linear speed-up. Achieving this linear speed-up requires completely new theoretical arguments. We apply the proposed method to federated temporal difference learning with linear function approximation and analyze the corresponding complexity improvements. ... In this section, we demonstrate the performance of Fed LSA and SCAFFLSA under varying levels of heterogeneity. We consider the Garnet problem [2, 16], with n = 30 states embedded in d = 8 dimensions, a = 2 actions, and each state linked to b = 2 others in the transition kernel. We aim to estimate the value function of the policy that chooses actions uniformly at random, in homogeneous and heterogeneous setups. In all experiments, we initialize the algorithms in a neighborhood of the solution, allowing us to observe both transient and stationary regimes. We provide all details regarding the experimental setup in Appendix G. |
| Researcher Affiliation | Collaboration | Paul Mangold: CMAP, UMR 7641, École polytechnique; Sergey Samsonov: HSE University, Russia; Safwan Labbi: CMAP, UMR 7641, École polytechnique; Ilya Levin: HSE University, Russia; Reda Alami: Technology Innovation Institute, 9639 Masdar City, Abu Dhabi, United Arab Emirates; Alexey Naumov: HSE University, Steklov Mathematical Institute of Russian Academy of Sciences; Eric Moulines: CMAP, UMR 7641, École polytechnique; MBZUAI |
| Pseudocode | Yes | Algorithm 1 Fed LSA; Algorithm 2 SCAFFLSA: Stochastic Controlled Fed LSA with deterministic communication; Algorithm 3 Fed LSA with Markovian data; Algorithm 4 Federated TD(0): Fed LSA applied to TD(0) with linear function approximation; Algorithm 5 SCAFFTD(0): SCAFFLSA applied to TD(0) with linear function approximation; Algorithm 6 "Scaffnew": Stochastic Controlled Fed LSA with probabilistic communication |
| Open Source Code | Yes | Our code is available either as supplementary material or online on GitHub: https://github.com/pmangold/scafflsa. |
| Open Datasets | Yes | In this section, we demonstrate the performance of Fed LSA and SCAFFLSA under varying levels of heterogeneity. We consider the Garnet problem [2, 16], with n = 30 states embedded in d = 8 dimensions, a = 2 actions, and each state is linked to b = 2 others in the transition kernel. |
| Dataset Splits | No | No specific validation dataset splits or methodology were mentioned. The paper describes a total number of updates (TH = 500,000) for experiments. |
| Hardware Specification | No | All the experiments presented in this paper can be run on a single laptop in just a few hours. |
| Software Dependencies | No | The paper mentions that the code is available on GitHub but does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch versions) within the text. |
| Experiment Setup | Yes | In Figures 1(a) to 1(d), we plot the MSE with N ∈ {10, 100}, H ∈ {10, 1000} and η = 0.1, with the same total number of updates TH = 500,000. ... In all experiments, we initialize the algorithms in a neighborhood of the solution, allowing us to observe both transient and stationary regimes. We provide all details regarding the experimental setup in Appendix G. |
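To make the control-variate idea concrete, the following is a minimal sketch of a SCAFFLSA-style federated LSA loop on a deterministic, heterogeneous linear problem. It is an illustration under simplifying assumptions (noiseless per-agent operators `A_c`, `b_c`, deterministic communication every `local_steps` updates, and a Scaffnew-style control-variate correction scaled by `1/(eta * local_steps)`); the exact update rules and step-size conditions are those of the paper's Algorithm 2, which may differ in detail. All function and variable names here are hypothetical.

```python
import numpy as np

def scafflsa_sketch(As, bs, eta=0.1, local_steps=10, rounds=200):
    """Sketch of control-variate-corrected federated LSA.

    Each agent c runs `local_steps` local updates
        theta_c <- theta_c - eta * (A_c @ theta_c - b_c - xi_c),
    then the server averages the local iterates and each agent shifts
    its control variate xi_c toward the average (Scaffnew-style).
    The control variates sum to zero at all times, so the unique fixed
    point solves mean(A_c) @ theta = mean(b_c) despite heterogeneity.
    """
    N, d = len(As), bs[0].shape[0]
    theta = np.zeros(d)                      # global iterate
    xis = [np.zeros(d) for _ in range(N)]    # per-agent control variates
    for _ in range(rounds):
        local_iterates = []
        for c in range(N):
            th = theta.copy()
            for _ in range(local_steps):     # drift-corrected local updates
                th -= eta * (As[c] @ th - bs[c] - xis[c])
            local_iterates.append(th)
        theta = np.mean(local_iterates, axis=0)   # communication round
        for c in range(N):                        # control-variate update
            xis[c] += (theta - local_iterates[c]) / (eta * local_steps)
    return theta
```

Without the `xis` correction, this loop reduces to plain Fed LSA, whose averaged fixed point is biased toward the agents' individual solutions when the `(A_c, b_c)` differ; the control variates remove that client-drift bias at the cost of one extra vector of state per agent.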