Multi-Agent Common Knowledge Reinforcement Learning
Authors: Christian Schroeder de Witt, Jakob Foerster, Gregory Farquhar, Philip Torr, Wendelin Boehmer, Shimon Whiteson
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Pairwise MACKRL (henceforth referred to as MACKRL) on two environments: first, we use a matrix game with special coordination requirements to illustrate MACKRL's ability to surpass both IL and JAL. Secondly, we employ MACKRL with deep recurrent neural network policies in order to outperform state-of-the-art baselines on a number of challenging StarCraft II unit micromanagement tasks. |
| Researcher Affiliation | Academia | Correspondence to Christian Schroeder de Witt <cs@robots.ox.ac.uk> University of Oxford, UK |
| Pseudocode | Yes | Algorithm 1 Decentralised action selection for agent a ∈ A in MACKRL; Algorithm 2 Compute joint policies for a given common knowledge of a group of agents G in MACKRL (a hedged sketch of this hierarchy appears below the table) |
| Open Source Code | Yes | All source code is available at https://github.com/schroederdewitt/mackrl. |
| Open Datasets | Yes | We then apply MACKRL to challenging StarCraft II unit micromanagement tasks (Vinyals et al., 2017) from the StarCraft Multi-Agent Challenge (SMAC, Samvelyan et al., 2019). |
| Dataset Splits | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details). |
| Hardware Specification | No | It was also supported by the Oxford-Google DeepMind Graduate Scholarship and a generous equipment grant from NVIDIA. |
| Software Dependencies | No | No specific software versions or dependencies with version numbers are provided in the paper. |
| Experiment Setup | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details). In addition, MACKRL and its within-class baseline Central-V share equal hyper-parameters as far as applicable. |