Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control
Authors: Sai Qian Zhang, Qi Zhang, Jieyu Lin
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation using multiple MARL benchmarks indicates that our method achieves 2-10× lower communication overhead than state-of-the-art MARL algorithms, while allowing agents to achieve better overall performance. |
| Researcher Affiliation | Collaboration | Sai Qian Zhang (Harvard University); Qi Zhang (Amazon Inc.); Jieyu Lin (University of Toronto) |
| Pseudocode | Yes | Algorithm 1: Communication protocol at agent i (a hedged sketch follows the table) |
| Open Source Code | Yes | The code is available at https://github.com/saizhang0218/VBC. |
| Open Datasets | Yes | For evaluation, we test VBC on several MARL benchmarks, including the StarCraft Multi-Agent Challenge [15], Cooperative Navigation (CN) [10] and Predator-Prey (PP) [8]. |
| Dataset Splits | No | The paper describes training duration (e.g., '2 million and 4 million episodes') and test episodes ('20 test episodes'), but does not explicitly specify train/validation/test dataset splits, percentages, or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the hyperparameters used by VBC (i.e., λ in equation (1), δ1 and δ2 in Algorithm 1), we first search for a coarse parameter range based on random trials, experience, and message statistics, then perform a random search within a smaller hyperparameter space. Best selections are shown in the legend of each figure. (See the second sketch below.) |
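
The Pseudocode row points at Algorithm 1, the per-agent communication protocol. As described in the paper, VBC gates communication on variance: an agent only solicits messages when its local action-value estimates are too flat to act on confidently, and a peer only replies when its outgoing message carries enough signal to be worth sending. Below is a minimal Python sketch of that gating idea; the helper names, the additive combiner, and the toy thresholds are assumptions for illustration, not the paper's implementation (the paper's combiner is a learned network).

```python
import numpy as np

def should_request_messages(local_q: np.ndarray, delta_1: float) -> bool:
    """Agent i requests peer messages only when its local action values
    are too flat to pick an action confidently (low variance ~ low confidence)."""
    return float(np.var(local_q)) <= delta_1

def should_reply(message: np.ndarray, delta_2: float) -> bool:
    """A peer replies only when its message is informative enough (high
    variance); near-constant messages are suppressed to save bandwidth."""
    return float(np.var(message)) > delta_2

# Toy round of the protocol for one agent with two peers.
rng = np.random.default_rng(0)
local_q = 0.01 * rng.normal(size=5)            # nearly flat -> low confidence
peer_messages = [rng.normal(size=5), np.zeros(5)]

combined_q = local_q.copy()
if should_request_messages(local_q, delta_1=0.05):
    for m in peer_messages:
        if should_reply(m, delta_2=0.01):
            combined_q += m                    # stand-in for the learned combiner

action = int(np.argmax(combined_q))
```

In this toy round the all-zeros peer message fails the δ2 test and is never transmitted, which is exactly where the claimed communication savings come from: most low-variance traffic is dropped at the sender.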
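The Experiment Setup row names λ from equation (1) together with the thresholds δ1 and δ2. In VBC the training objective adds a λ-weighted variance penalty on the messages to the usual TD loss, which pushes messages toward low variance so that fewer of them clear the send threshold at execution time. A minimal PyTorch-style sketch under those assumptions; the tensor shapes, the mean reduction, and the log-uniform sampling range for the random search are all hypothetical:

```python
import random
import torch

def vbc_style_loss(q_pred: torch.Tensor,
                   q_target: torch.Tensor,
                   messages: torch.Tensor,
                   lam: float) -> torch.Tensor:
    """TD loss plus a λ-weighted penalty on message variance (equation (1)
    in spirit; the exact reduction in the paper may differ)."""
    td_loss = torch.mean((q_pred - q_target.detach()) ** 2)
    var_penalty = torch.mean(messages.var(dim=-1))   # per-message variance
    return td_loss + lam * var_penalty

# Coarse-then-fine random search over λ, as the setup row describes:
random.seed(0)
lam = 10 ** random.uniform(-3, 0)                    # coarse log-uniform draw

q_pred = torch.randn(32)                             # predicted action values
q_target = torch.randn(32)                           # bootstrapped targets
messages = torch.randn(32, 4, 8)                     # (batch, agents, msg_dim)
loss = vbc_style_loss(q_pred, q_target, messages, lam=lam)
```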