Multi-Agent Common Knowledge Reinforcement Learning

Authors: Christian Schroeder de Witt, Jakob Foerster, Gregory Farquhar, Philip Torr, Wendelin Boehmer, Shimon Whiteson

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate Pairwise MACKRL (henceforth referred to as MACKRL) on two environments: first, we use a matrix game with special coordination requirements to illustrate MACKRL's ability to surpass both IL and JAL. Secondly, we employ MACKRL with deep recurrent neural network policies in order to outperform state-of-the-art baselines on a number of challenging StarCraft II unit micromanagement tasks.
Researcher Affiliation | Academia | Correspondence to Christian Schroeder de Witt <cs@robots.ox.ac.uk>, University of Oxford, UK
Pseudocode | Yes | Algorithm 1: Decentralised action selection for agent a ∈ A in MACKRL; Algorithm 2: Compute joint policies for a given u^G_env of a group of agents G in MACKRL (see the action-selection sketch after the table).
Open Source Code | Yes | All source code is available at https://github.com/schroederdewitt/mackrl.
Open Datasets | Yes | We then apply MACKRL to challenging StarCraft II unit micromanagement tasks (Vinyals et al., 2017) from the StarCraft Multi-Agent Challenge (SMAC, Samvelyan et al., 2019) (see the environment sketch after the table).
Dataset Splits | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details).
Hardware Specification | No | It was also supported by the Oxford-Google DeepMind Graduate Scholarship and a generous equipment grant from NVIDIA.
Software Dependencies | No | No specific software versions or dependencies with version numbers are provided in the paper.
Experiment Setup | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details). In addition, MACKRL and its within-class baseline Central-V share equal hyper-parameters as far as applicable.
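
The Pseudocode row cites Algorithm 1, in which each agent independently runs a hierarchy of controllers over common knowledge: a pair controller either selects a joint action for the pair or delegates to the agents' decentralised policies. The following is a minimal, illustrative sketch of that selection logic for a single pair of agents, assuming a shared random seed so both agents sample the same high-level decision; the function names, uniform stand-in policies, and constants are ours, not the paper's or the released code's.

```python
import numpy as np

N_ACTIONS = 5                        # per-agent actions (illustrative)
DELEGATE = N_ACTIONS * N_ACTIONS     # last pair-controller index = "delegate"

def pair_controller(common_knowledge):
    """Hypothetical stand-in for the pair controller: a distribution over
    all joint actions of the pair plus one extra 'delegate' option."""
    logits = np.ones(N_ACTIONS * N_ACTIONS + 1)
    return logits / logits.sum()

def agent_policy(agent_id, private_obs):
    """Hypothetical stand-in for an agent's decentralised policy."""
    logits = np.ones(N_ACTIONS)
    return logits / logits.sum()

def select_action(agent_id, common_knowledge, private_obs, shared_seed):
    """Each agent runs this independently; the shared seed makes the sampled
    pair-controller decision identical across both agents."""
    rng = np.random.default_rng(shared_seed)
    choice = rng.choice(DELEGATE + 1, p=pair_controller(common_knowledge))
    if choice == DELEGATE:
        # Fall back to the decentralised policy on the private observation.
        return rng.choice(N_ACTIONS, p=agent_policy(agent_id, private_obs))
    # Otherwise decode this agent's component of the sampled joint action.
    return choice // N_ACTIONS if agent_id == 0 else choice % N_ACTIONS

# Both agents select consistently from the same common knowledge and seed.
a0 = select_action(0, common_knowledge=None, private_obs=None, shared_seed=42)
a1 = select_action(1, common_knowledge=None, private_obs=None, shared_seed=42)
```

The point of the sketch is the consistency argument: because the pair-level choice conditions only on common knowledge and shared randomness, every agent reaches the same decision without communication, and private observations are consulted only after delegation.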
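The Open Datasets row refers to the SMAC benchmark, which is available as an open-source Python package. A minimal sketch of instantiating one of its micromanagement tasks, assuming the `smac` package and the standard "3m" map, with random actions as a placeholder policy (not the paper's method):

```python
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")         # one of the standard SMAC maps
env_info = env.get_env_info()
n_agents = env_info["n_agents"]

env.reset()
terminated = False
episode_return = 0.0
while not terminated:
    obs = env.get_obs()                    # per-agent observations
    state = env.get_state()                # global state (centralised training only)
    actions = []
    for agent_id in range(n_agents):
        avail = env.get_avail_agent_actions(agent_id)
        actions.append(np.random.choice(np.nonzero(avail)[0]))  # random placeholder
    reward, terminated, info = env.step(actions)
    episode_return += reward
env.close()
```

Comparability across papers comes from reusing these default SMAC settings unchanged, which is what the Dataset Splits and Experiment Setup rows quote.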