The Sample-Communication Complexity Trade-off in Federated Q-Learning

Authors: Sudeep Salgia, Yuejie Chi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform three empirical studies. In the first study, we compare the proposed algorithm Fed-DVR-Q to the FedSynQ algorithm proposed in Woo et al. [2023]. We consider a Federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent.
Researcher Affiliation | Academia | Sudeep Salgia (Carnegie Mellon University, ssalgia@andrew.cmu.edu); Yuejie Chi (Carnegie Mellon University, yuejiechi@cmu.edu)
Pseudocode | Yes | Algorithm 1: a generic algorithm A; Algorithm 2: Fed-DVR-Q; Algorithm 3: RefineEstimate(Q, B, I, L, D, J). (A variance-reduced update in the spirit of Algorithm 3 is sketched below the table.)
Open Source Code | No | The paper does not have associated code or data. (From NeurIPS Paper Checklist, Question 5)
Open Datasets | No | In this work, we consider the synchronous setting, where each agent has access to an independent generative model or simulator from which they can draw independent samples from the unknown underlying distribution P(·|s, a) for each state-action pair (s, a) [Kearns and Singh, 1998]. (See the generative-model sketch below the table.)
Dataset Splits | No | The paper focuses on theoretical sample and communication complexity with a generative model, and does not describe explicit training, validation, or test dataset splits in the context of empirical evaluation.
Hardware Specification | No | The empirical studies require no specific compute resources and can be easily completed on a regular laptop. (From NeurIPS Paper Checklist, Question 8)
Software Dependencies | No | The paper refers to specific algorithms like Fed-DVR-Q and FedSynQ, and general parameters (e.g., number of samples, error rates), but does not specify software dependencies with version numbers.
Experiment Setup | Yes | We consider a Federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent... We vary the number of agents from 5 to 25 in multiples of 5 and record the sample and communication complexity to achieve an error rate of ε = 0.03... We run the algorithm to achieve an accuracy of ε = 0.1 with the parameter choices prescribed in Sec. 4.1.3. (See the experiment-loop sketch below.)
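
The Open Datasets row quotes the paper's synchronous setting, where each agent queries an independent generative model. The following is a minimal sketch of that access model, assuming a randomly generated tabular MDP; the class name GenerativeModel and all parameter choices are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

class GenerativeModel:
    """Hypothetical generative model (simulator) for a tabular MDP.

    For any state-action pair (s, a), it returns an independent next state
    drawn from the unknown transition distribution P(.|s, a) together with
    the reward r(s, a) -- the access model of the synchronous setting
    [Kearns and Singh, 1998].
    """

    def __init__(self, num_states, num_actions, seed=0):
        rng = np.random.default_rng(seed)
        # Random transition kernel: P[s, a] is a distribution over next states.
        self.P = rng.dirichlet(np.ones(num_states), size=(num_states, num_actions))
        self.r = rng.uniform(0.0, 1.0, size=(num_states, num_actions))
        self.rng = rng

    def sample(self, s, a):
        """Draw one independent sample s' ~ P(.|s, a); return (s', r(s, a))."""
        s_next = self.rng.choice(self.P.shape[0], p=self.P[s, a])
        return s_next, self.r[s, a]
```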
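The Pseudocode row lists RefineEstimate (Algorithm 3), the variance-reduced subroutine inside Fed-DVR-Q. The paper's exact recursion is not reproduced here; the sketch below shows the generic variance-reduced Q-learning template such subroutines build on, where the stochastic Bellman target is recentred at a fixed reference estimate whose Bellman image is pre-computed from a batch. The function names, step size eta, and epoch structure are assumptions, not the paper's specification.

```python
import numpy as np

def estimate_bellman_image(model, Q_ref, num_samples, gamma):
    """Batch Monte Carlo estimate of (T Q_ref), computed once per epoch."""
    S, A = Q_ref.shape
    TQ_ref = np.zeros((S, A))
    for s in range(S):
        for a in range(A):
            for _ in range(num_samples):
                s_next, r = model.sample(s, a)
                TQ_ref[s, a] += (r + gamma * Q_ref[s_next].max()) / num_samples
    return TQ_ref

def variance_reduced_update(model, Q, Q_ref, TQ_ref, eta, gamma):
    """One synchronous, variance-reduced Q-learning step (a generic sketch,
    not the paper's exact Fed-DVR-Q recursion)."""
    S, A = Q.shape
    Q_new = np.empty_like(Q)
    for s in range(S):
        for a in range(A):
            # One fresh sample, reused for both Q and Q_ref, so that the
            # recentring term cancels the shared sampling noise.
            s_next, r = model.sample(s, a)
            t_q = r + gamma * Q[s_next].max()
            t_qref = r + gamma * Q_ref[s_next].max()
            Q_new[s, a] = (1 - eta) * Q[s, a] + eta * (t_q - t_qref + TQ_ref[s, a])
    return Q_new
```

The recentred target t_q - t_qref + TQ_ref has much lower variance than the plain stochastic target once Q is close to Q_ref, which is what lets variance-reduced methods take larger effective steps per sample.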
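Finally, the Experiment Setup row describes varying the number of agents from 5 to 25 and recording sample and communication complexity at a target accuracy. The loop below is a toy sketch of how those two quantities can be counted in simulation; it uses plain local Q-learning with periodic averaging rather than Fed-DVR-Q or FedSynQ, and sync_every, num_rounds, and the MDP size are illustrative values, not the parameter choices of Sec. 4.1.3.

```python
import numpy as np

def federated_q_learning(models, gamma=0.9, eta=0.1, sync_every=10,
                         num_rounds=500):
    """Toy synchronous federated Q-learning with periodic averaging.

    models : one GenerativeModel per agent (from the sketch above).
    Returns the averaged Q-table, the per-agent sample count, and the
    number of scalars communicated -- the two quantities compared in the
    paper's experiments.
    """
    K = len(models)
    S, A = models[0].P.shape[:2]
    Qs = [np.zeros((S, A)) for _ in range(K)]
    samples_per_agent = 0
    scalars_communicated = 0

    for t in range(num_rounds):
        # Each agent performs one full synchronous sweep of local updates.
        for k in range(K):
            for s in range(S):
                for a in range(A):
                    s_next, r = models[k].sample(s, a)
                    target = r + gamma * Qs[k][s_next].max()
                    Qs[k][s, a] = (1 - eta) * Qs[k][s, a] + eta * target
        samples_per_agent += S * A
        if (t + 1) % sync_every == 0:
            # Each agent uploads its Q-table and downloads the average.
            Q_avg = np.mean(Qs, axis=0)
            Qs = [Q_avg.copy() for _ in range(K)]
            scalars_communicated += 2 * K * S * A
    return np.mean(Qs, axis=0), samples_per_agent, scalars_communicated

# Example: vary the number of agents from 5 to 25 in multiples of 5.
for K in range(5, 26, 5):
    models = [GenerativeModel(num_states=20, num_actions=4, seed=k)
              for k in range(K)]
    _, n_samples, n_comm = federated_q_learning(models)
    print(K, n_samples, n_comm)
```

Counting uploaded and downloaded scalars per synchronization round, as above, is one simple accounting convention; the paper's formal definition of communication complexity may differ in its units and constants.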