Multi-Agent Common Knowledge Reinforcement Learning

Authors: Christian Schroeder de Witt, Jakob Foerster, Gregory Farquhar, Philip Torr, Wendelin Boehmer, Shimon Whiteson

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate Pairwise MACKRL (henceforth referred to as MACKRL) on two environments: first, we use a matrix game with special coordination requirements to illustrate MACKRL's ability to surpass both IL and JAL. Secondly, we employ MACKRL with deep recurrent neural network policies in order to outperform state-of-the-art baselines on a number of challenging StarCraft II unit micromanagement tasks.
Researcher Affiliation | Academia | Correspondence to Christian Schroeder de Witt <cs@robots.ox.ac.uk>, University of Oxford, UK
Pseudocode | Yes | Algorithm 1: Decentralised action selection for agent a ∈ A in MACKRL; Algorithm 2: Compute joint policies for a given u^G_env of a group of agents G in MACKRL (see the action-selection sketch after the table).
Open Source Code | Yes | All source code is available at https://github.com/schroederdewitt/mackrl.
Open Datasets | Yes | We then apply MACKRL to challenging StarCraft II unit micromanagement tasks (Vinyals et al., 2017) from the StarCraft Multi-Agent Challenge (SMAC, Samvelyan et al., 2019) (see the environment sketch after the table).
Dataset Splits | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details).
Hardware Specification | No | It was also supported by the Oxford-Google DeepMind Graduate Scholarship and a generous equipment grant from NVIDIA.
Software Dependencies | No | No specific software versions or dependencies with version numbers are provided in the paper.
Experiment Setup | No | All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details). In addition, MACKRL and its within-class baseline Central-V share equal hyper-parameters as far as applicable.
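
The Pseudocode row cites Algorithm 1, in which each agent independently runs a hierarchy of controllers over common knowledge: a pair controller either selects a joint action for the pair or delegates to the agents' decentralised policies. The following is a minimal, illustrative sketch of that selection logic for a single pair of agents, assuming a shared random seed so both agents sample the same high-level decision; the function names, uniform stand-in policies, and constants are ours, not the paper's or the released code's.

```python
import numpy as np

N_ACTIONS = 5                        # per-agent actions (illustrative)
DELEGATE = N_ACTIONS * N_ACTIONS     # last pair-controller index = "delegate"

def pair_controller(common_knowledge):
    """Hypothetical stand-in for the pair controller: a distribution over
    all joint actions of the pair plus one extra 'delegate' option."""
    logits = np.ones(N_ACTIONS * N_ACTIONS + 1)
    return logits / logits.sum()

def agent_policy(agent_id, private_obs):
    """Hypothetical stand-in for an agent's decentralised policy."""
    logits = np.ones(N_ACTIONS)
    return logits / logits.sum()

def select_action(agent_id, common_knowledge, private_obs, shared_seed):
    """Each agent runs this independently; the shared seed makes the sampled
    pair-controller decision identical across both agents."""
    rng = np.random.default_rng(shared_seed)
    choice = rng.choice(DELEGATE + 1, p=pair_controller(common_knowledge))
    if choice == DELEGATE:
        # Fall back to the decentralised policy on the private observation.
        return rng.choice(N_ACTIONS, p=agent_policy(agent_id, private_obs))
    # Otherwise decode this agent's component of the sampled joint action.
    return choice // N_ACTIONS if agent_id == 0 else choice % N_ACTIONS

# Both agents select consistently from the same common knowledge and seed.
a0 = select_action(0, common_knowledge=None, private_obs=None, shared_seed=42)
a1 = select_action(1, common_knowledge=None, private_obs=None, shared_seed=42)
```

The point of the sketch is the consistency argument: because the pair-level choice conditions only on common knowledge and shared randomness, every agent reaches the same decision without communication, and private observations are consulted only after delegation.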
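The Open Datasets row refers to the SMAC benchmark, which is available as an open-source Python package. A minimal sketch of instantiating one of its micromanagement tasks, assuming the `smac` package and the standard "3m" map, with random actions as a placeholder policy (not the paper's method):

```python
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")         # one of the standard SMAC maps
env_info = env.get_env_info()
n_agents = env_info["n_agents"]

env.reset()
terminated = False
episode_return = 0.0
while not terminated:
    obs = env.get_obs()                    # per-agent observations
    state = env.get_state()                # global state (centralised training only)
    actions = []
    for agent_id in range(n_agents):
        avail = env.get_avail_agent_actions(agent_id)
        actions.append(np.random.choice(np.nonzero(avail)[0]))  # random placeholder
    reward, terminated, info = env.step(actions)
    episode_return += reward
env.close()
```

Comparability across papers comes from reusing these default SMAC settings unchanged, which is what the Dataset Splits and Experiment Setup rows quote.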