Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation
Authors: Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a provably efficient algorithm based on value iteration that enables asynchronous communication while ensuring the advantage of cooperation with low communication overhead. With linear function approximation, we prove that our algorithm enjoys an Õ(d^{3/2} H^2 √K) regret with Õ(d H M^2) communication complexity... We also provide a lower bound showing that a minimal Ω(dM) communication complexity is required to improve the performance through collaboration. |
| Researcher Affiliation | Academia | Yifei Min*¹, Jiafan He*², Tianhao Wang*¹, Quanquan Gu². ¹Department of Statistics and Data Science, Yale University; ²Department of Computer Science, University of California, Los Angeles. Correspondence to: Quanquan Gu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Communication Protocol |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper is theoretical and focuses on algorithmic design and theoretical analysis; therefore, it does not mention specific datasets used for training or their public availability. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation; therefore, it does not provide dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not specify any particular hardware used for its research or theoretical derivations. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include details about an experimental setup, such as hyperparameters or system-level training settings. |
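For intuition on the classified methodology, the paper's core ingredients are a least-squares value-iteration step under linear function approximation and an event-triggered communication protocol. The sketch below is illustrative only, not the authors' exact algorithm: the ridge-regression update is the standard LSVI step implied by "value iteration with linear function approximation", while the determinant-ratio trigger (`should_communicate`, with an assumed threshold `alpha`) is a common device for deciding when an agent synchronizes its local data with the server in asynchronous cooperative linear bandits/RL.

```python
import numpy as np

def ridge_value_update(Phi, y, lam=1.0):
    """One least-squares value-iteration step with linear function
    approximation: solve w = (Phi^T Phi + lam*I)^{-1} Phi^T y.

    Phi : (n, d) feature matrix of visited state-action pairs
    y   : (n,) regression targets (reward plus estimated next-step value)
    Returns the weight vector and the regularized covariance matrix."""
    d = Phi.shape[1]
    Lambda = Phi.T @ Phi + lam * np.eye(d)
    w = np.linalg.solve(Lambda, Phi.T @ y)
    return w, Lambda

def should_communicate(Lambda_local, Lambda_synced, alpha=2.0):
    """Hypothetical determinant-ratio trigger: the agent uploads its
    local data once its covariance matrix has grown sufficiently
    relative to the last synchronized copy. The threshold `alpha` is
    an assumption for illustration; log-determinants are compared for
    numerical stability."""
    _, logdet_local = np.linalg.slogdet(Lambda_local)
    _, logdet_synced = np.linalg.slogdet(Lambda_synced)
    return logdet_local - logdet_synced > np.log(alpha)
```

Because communication only fires when the local covariance has grown by a constant factor, each agent triggers at most logarithmically many synchronizations per direction of growth, which is the mechanism behind low communication overhead in this family of algorithms.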