Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation
Authors: Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a provably efficient algorithm based on value iteration that enables asynchronous communication while ensuring the advantage of cooperation with low communication overhead. With linear function approximation, we prove that our algorithm enjoys an $\tilde{O}(d^{3/2}H^2\sqrt{K})$ regret with $\tilde{O}(dHM^2)$ communication complexity... We also provide a lower bound showing that a minimal $\Omega(dM)$ communication complexity is required to improve the performance through collaboration. |
| Researcher Affiliation | Academia | Yifei Min*¹, Jiafan He*², Tianhao Wang*¹, Quanquan Gu². ¹Department of Statistics and Data Science, Yale University; ²Department of Computer Science, University of California, Los Angeles. Correspondence to: Quanquan Gu <qgu@cs.ucla.edu>. |
| Pseudocode | Yes | Algorithm 1 (Communication Protocol); see the illustrative sketch after the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper is theoretical and focuses on algorithmic design and theoretical analysis; therefore, it does not mention specific datasets used for training or their public availability. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation; therefore, it does not provide dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not specify any particular hardware used for its research or theoretical derivations. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include details about an experimental setup, such as hyperparameters or system-level training settings. |
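
For context on the quoted bounds: per the paper's abstract, $d$ is the feature dimension, $H$ the horizon length, $K$ the total number of episodes, and $M$ the number of agents, so the algorithm achieves $\tilde{O}(d^{3/2}H^2\sqrt{K})$ regret with $\tilde{O}(dHM^2)$ communication complexity against an $\Omega(dM)$ communication lower bound.

Although no code is released, the flavor of Algorithm 1 can be sketched. The Python below is a minimal, hypothetical illustration of a determinant-ratio communication trigger, a standard device in cooperative linear RL and bandit protocols: each agent accumulates a local Gram matrix of feature vectors and contacts the server only when its local information has grown by a fixed factor since the last synchronization. The class name `AsyncAgent`, the threshold `alpha`, and all constants are illustrative assumptions, not taken from the paper.

```python
import numpy as np

class AsyncAgent:
    """One agent's local state in a hypothetical determinant-trigger
    communication protocol (an illustrative sketch, not the paper's code)."""

    def __init__(self, d: int, alpha: float = 1.5):
        self.alpha = alpha              # growth factor that triggers a sync
        self.Lambda_sync = np.eye(d)    # Gram matrix at the last synchronization
        self.Lambda_local = np.eye(d)   # Gram matrix including unsent local data

    def observe(self, phi: np.ndarray) -> bool:
        """Record one feature vector; return True if the agent should
        upload its local data to the server."""
        self.Lambda_local += np.outer(phi, phi)
        _, logdet_local = np.linalg.slogdet(self.Lambda_local)
        _, logdet_sync = np.linalg.slogdet(self.Lambda_sync)
        # Communicate only when local information has grown enough; the
        # log-determinant can increase by log(alpha) at most O(d log K)
        # times over K episodes, so synchronizations stay rare.
        return logdet_local - logdet_sync > np.log(self.alpha)

    def synchronize(self) -> None:
        """Mark all local data as shared after a server round-trip."""
        self.Lambda_sync = self.Lambda_local.copy()

# Usage: each agent runs this loop independently; communication is
# asynchronous because agents trigger syncs at different times.
rng = np.random.default_rng(0)
agent = AsyncAgent(d=4)
for t in range(50):
    phi = rng.normal(size=4) / 2.0      # illustrative feature vector
    if agent.observe(phi):
        agent.synchronize()
        print(f"step {t}: agent synchronized with server")
```

A trigger of this form is what makes communication scale with $d$ and only logarithmically with $K$, consistent in spirit with the $\tilde{O}(dHM^2)$ communication complexity quoted above.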