The Sample-Communication Complexity Trade-off in Federated Q-Learning

Authors: Sudeep Salgia, Yuejie Chi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform three empirical studies. In the first study, we compare the proposed algorithm Fed-DVR-Q to the FedSynQ algorithm proposed in Woo et al. [2023]. We consider a Federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent.
Researcher Affiliation | Academia | Sudeep Salgia (Carnegie Mellon University, ssalgia@andrew.cmu.edu); Yuejie Chi (Carnegie Mellon University, yuejiechi@cmu.edu)
Pseudocode | Yes | Algorithm 1: a generic algorithm A; Algorithm 2: Fed-DVR-Q; Algorithm 3: RefineEstimate(Q, B, I, L, D, J). (A variance-reduced update in the spirit of Algorithm 3 is sketched below the table.)
Open Source Code | No | The paper does not have associated code or data. (From NeurIPS Paper Checklist, Question 5)
Open Datasets | No | In this work, we consider the synchronous setting, where each agent has access to an independent generative model or simulator from which they can draw independent samples from the unknown underlying distribution P(·|s, a) for each state-action pair (s, a) [Kearns and Singh, 1998]. (See the generative-model sketch below the table.)
Dataset Splits | No | The paper focuses on theoretical sample and communication complexity with a generative model, and does not describe explicit training, validation, or test dataset splits in the context of empirical evaluation.
Hardware Specification | No | The empirical studies require no specific compute resources and can be easily completed on a regular laptop. (From NeurIPS Paper Checklist, Question 8)
Software Dependencies | No | The paper refers to specific algorithms like Fed-DVR-Q and FedSynQ, and general parameters (e.g., number of samples, error rates), but does not specify software dependencies with version numbers.
Experiment Setup | Yes | We consider a Federated Q-learning setting with 5 agents. The parameters for both algorithms were set to the suggested values in the respective papers. Both algorithms were run with 10^7 samples at each agent... We vary the number of agents from 5 to 25 in multiples of 5 and record the sample and communication complexity to achieve an error rate of ε = 0.03... We run the algorithm to achieve an accuracy of ε = 0.1 with the parameter choices prescribed in Sec. 4.1.3. (See the experiment-loop sketch below.)
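
The Open Datasets row quotes the paper's synchronous setting, where each agent queries an independent generative model. The following is a minimal sketch of that access model, assuming a randomly generated tabular MDP; the class name GenerativeModel and all parameter choices are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

class GenerativeModel:
    """Hypothetical generative model (simulator) for a tabular MDP.

    For any state-action pair (s, a), it returns an independent next state
    drawn from the unknown transition distribution P(.|s, a) together with
    the reward r(s, a) -- the access model of the synchronous setting
    [Kearns and Singh, 1998].
    """

    def __init__(self, num_states, num_actions, seed=0):
        rng = np.random.default_rng(seed)
        # Random transition kernel: P[s, a] is a distribution over next states.
        self.P = rng.dirichlet(np.ones(num_states), size=(num_states, num_actions))
        self.r = rng.uniform(0.0, 1.0, size=(num_states, num_actions))
        self.rng = rng

    def sample(self, s, a):
        """Draw one independent sample s' ~ P(.|s, a); return (s', r(s, a))."""
        s_next = self.rng.choice(self.P.shape[0], p=self.P[s, a])
        return s_next, self.r[s, a]
```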
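The Pseudocode row lists RefineEstimate (Algorithm 3), the variance-reduced subroutine inside Fed-DVR-Q. The paper's exact recursion is not reproduced here; the sketch below shows the generic variance-reduced Q-learning template such subroutines build on, where the stochastic Bellman target is recentred at a fixed reference estimate whose Bellman image is pre-computed from a batch. The function names, step size eta, and epoch structure are assumptions, not the paper's specification.

```python
import numpy as np

def estimate_bellman_image(model, Q_ref, num_samples, gamma):
    """Batch Monte Carlo estimate of (T Q_ref), computed once per epoch."""
    S, A = Q_ref.shape
    TQ_ref = np.zeros((S, A))
    for s in range(S):
        for a in range(A):
            for _ in range(num_samples):
                s_next, r = model.sample(s, a)
                TQ_ref[s, a] += (r + gamma * Q_ref[s_next].max()) / num_samples
    return TQ_ref

def variance_reduced_update(model, Q, Q_ref, TQ_ref, eta, gamma):
    """One synchronous, variance-reduced Q-learning step (a generic sketch,
    not the paper's exact Fed-DVR-Q recursion)."""
    S, A = Q.shape
    Q_new = np.empty_like(Q)
    for s in range(S):
        for a in range(A):
            # One fresh sample, reused for both Q and Q_ref, so that the
            # recentring term cancels the shared sampling noise.
            s_next, r = model.sample(s, a)
            t_q = r + gamma * Q[s_next].max()
            t_qref = r + gamma * Q_ref[s_next].max()
            Q_new[s, a] = (1 - eta) * Q[s, a] + eta * (t_q - t_qref + TQ_ref[s, a])
    return Q_new
```

The recentred target t_q - t_qref + TQ_ref has much lower variance than the plain stochastic target once Q is close to Q_ref, which is what lets variance-reduced methods take larger effective steps per sample.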
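Finally, the Experiment Setup row describes varying the number of agents from 5 to 25 and recording sample and communication complexity at a target accuracy. The loop below is a toy sketch of how those two quantities can be counted in simulation; it uses plain local Q-learning with periodic averaging rather than Fed-DVR-Q or FedSynQ, and sync_every, num_rounds, and the MDP size are illustrative values, not the parameter choices of Sec. 4.1.3.

```python
import numpy as np

def federated_q_learning(models, gamma=0.9, eta=0.1, sync_every=10,
                         num_rounds=500):
    """Toy synchronous federated Q-learning with periodic averaging.

    models : one GenerativeModel per agent (from the sketch above).
    Returns the averaged Q-table, the per-agent sample count, and the
    number of scalars communicated -- the two quantities compared in the
    paper's experiments.
    """
    K = len(models)
    S, A = models[0].P.shape[:2]
    Qs = [np.zeros((S, A)) for _ in range(K)]
    samples_per_agent = 0
    scalars_communicated = 0

    for t in range(num_rounds):
        # Each agent performs one full synchronous sweep of local updates.
        for k in range(K):
            for s in range(S):
                for a in range(A):
                    s_next, r = models[k].sample(s, a)
                    target = r + gamma * Qs[k][s_next].max()
                    Qs[k][s, a] = (1 - eta) * Qs[k][s, a] + eta * target
        samples_per_agent += S * A
        if (t + 1) % sync_every == 0:
            # Each agent uploads its Q-table and downloads the average.
            Q_avg = np.mean(Qs, axis=0)
            Qs = [Q_avg.copy() for _ in range(K)]
            scalars_communicated += 2 * K * S * A
    return np.mean(Qs, axis=0), samples_per_agent, scalars_communicated

# Example: vary the number of agents from 5 to 25 in multiples of 5.
for K in range(5, 26, 5):
    models = [GenerativeModel(num_states=20, num_actions=4, seed=k)
              for k in range(K)]
    _, n_samples, n_comm = federated_q_learning(models)
    print(K, n_samples, n_comm)
```

Counting uploaded and downloaded scalars per synchronization round, as above, is one simple accounting convention; the paper's formal definition of communication complexity may differ in its units and constants.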