The Sample-Communication Complexity Trade-off in Federated Q-Learning
Authors: Sudeep Salgia, Yuejie Chi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform three empirical studies. In the first study, we compare the proposed algorithm Fed-DVR-Q to the Fed-SynQ algorithm proposed in Woo et al. [2023]. We consider a federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent. |
| Researcher Affiliation | Academia | Sudeep Salgia Carnegie Mellon University ssalgia@andrew.cmu.edu Yuejie Chi Carnegie Mellon University yuejiechi@cmu.edu |
| Pseudocode | Yes | Algorithm 1: A generic algorithm A, Algorithm 2: Fed-DVR-Q, Algorithm 3: REFINEESTIMATE(Q, B, I, L, D, J) |
| Open Source Code | No | The paper does not have associated code or data. (From NeurIPS Paper Checklist, Question 5) |
| Open Datasets | No | In this work, we consider the synchronous setting, where each agent has access to an independent generative model or simulator from which they can draw independent samples from the unknown underlying distribution P( · |s, a) for each state-action pair (s, a) [Kearns and Singh, 1998]. |
| Dataset Splits | No | The paper focuses on theoretical sample and communication complexity with a generative model, and does not describe explicit training, validation, or test dataset splits in the context of empirical evaluation. |
| Hardware Specification | No | The empirical studies require no specific compute resources and can be easily completed on a regular laptop. (From NeurIPS Paper Checklist, Question 8) |
| Software Dependencies | No | The paper refers to specific algorithms like Fed-DVR-Q and Fed-SynQ, and general parameters (e.g., number of samples, error rates), but does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | We consider a federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent...We vary the number of agents from 5 to 25 in multiples of 5 and record the sample and communication complexity to achieve an error rate of ε = 0.03...We run the algorithm to achieve an accuracy of ε = 0.1 with parameter choices prescribed in Sec. 4.1.3. (A toy version of this setup is sketched below the table.) |
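
To make the Experiment Setup and Open Datasets rows above concrete, the following is a minimal sketch of a synchronous federated Q-learning run in which 5 agents query a generative model for every state-action pair and periodically average their Q-tables. This is not the paper's Fed-DVR-Q (or Woo et al.'s FedSynQ); the toy MDP size, step size, synchronization period, and sample budget are all assumed values chosen for illustration.

```python
# Minimal sketch (not the paper's Fed-DVR-Q): synchronous federated Q-learning
# on a toy randomly generated MDP, with K agents that each sample from a
# generative model and periodically average their Q-estimates.
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP with |S| states, |A| actions, discount gamma (all assumed values).
S, A, gamma = 10, 4, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
R = rng.uniform(size=(S, A))                 # deterministic rewards in [0, 1]

def sample_next_state(s, a):
    """Generative model: draw s' ~ P(.|s, a) for any queried (s, a)."""
    return rng.choice(S, p=P[s, a])

K = 5                # number of agents (matches the paper's first study)
T = 2000             # local Q-learning updates per agent (assumed)
sync_every = 100     # communication round frequency (assumed)
lr = 0.1             # constant step size (assumed)

Q = np.zeros((K, S, A))          # one Q-table per agent
comm_rounds = 0

for t in range(1, T + 1):
    # Synchronous setting: every agent updates every (s, a) pair each step.
    for k in range(K):
        for s in range(S):
            for a in range(A):
                s_next = sample_next_state(s, a)
                target = R[s, a] + gamma * Q[k, s_next].max()
                Q[k, s, a] += lr * (target - Q[k, s, a])
    # Periodic averaging: agents send Q-tables to a server, which broadcasts the mean.
    if t % sync_every == 0:
        Q[:] = Q.mean(axis=0, keepdims=True)
        comm_rounds += 1

samples_per_agent = T * S * A
print(f"samples/agent = {samples_per_agent}, communication rounds = {comm_rounds}")
```

The two printed counters mirror the two quantities the paper trades off: the number of generative-model samples drawn per agent and the number of communication rounds needed to reach a target accuracy.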