Universally Expressive Communication in Multi-Agent Reinforcement Learning
Authors: Matthew Morris, Thomas D Barrett, Arnu Pretorius
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, these augmentations are found to improve performance on tasks where expressive communication is required, whilst, in general, the optimal communication protocol is found to be task-dependent. |
| Researcher Affiliation | Collaboration | Matthew Morris (InstaDeep Ltd. & University of Oxford, matthew.morris@cs.ox.ac.uk); Thomas D. Barrett (InstaDeep Ltd., t.barrett@instadeep.com); Arnu Pretorius (InstaDeep Ltd., a.pretorius@instadeep.com) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code, environments, and instructions for reproducibility are included in the supplemental material. |
| Open Datasets | Yes | Predator-Prey [9, 27, 29, 37, 49] and Traffic Junction [9, 27, 29, 37, 49, 51] are common MARL communication benchmarks. ... We also introduce two new environments, Drone Scatter and Box Pushing, to respectively test symmetry-breaking and communication expressivity beyond 1-WL. ... New environments and code provided in the supplemental material. |
| Dataset Splits | Yes | Each epoch consists of 5000 training episodes, after which 100 evaluation episodes are used to report aggregate metric scores, yielding an evaluation score for the model after every epoch. (An illustrative epoch-loop sketch follows the table.) |
| Hardware Specification | Yes | All experiments were run on a single machine equipped with an Intel Core i9-9900K CPU, 64GB of RAM, and an NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | Yes | Our code base is written in Python 3.8. We use PyTorch 1.10.1 as our deep learning framework and OpenAI Gym 0.21.0 for the environment interface. |
| Experiment Setup | Yes | Full experiment and hyperparameter details can be found in Appendix C, and full results are shown in Appendix D. ... For each scenario and for every baseline communication method, we compare 4 models: the baseline without modifications, the baseline augmented with unique IDs for each agent, the baseline augmented with 0.75 RNI, and finally 0.25 RNI. (A hedged sketch of these observation augmentations follows the table.) |
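
The Dataset Splits row describes the evaluation protocol: 5000 training episodes per epoch followed by 100 evaluation episodes whose aggregate score is reported. The following is a minimal, self-contained sketch of that structure only; `run_episode` and `run_epoch` are hypothetical stubs for illustration, not functions from the released code.

```python
import random

TRAIN_EPISODES_PER_EPOCH = 5000
EVAL_EPISODES_PER_EPOCH = 100


def run_episode(train: bool) -> float:
    """Hypothetical stand-in for one environment episode; returns a score."""
    return random.random()


def run_epoch() -> float:
    """Train for 5000 episodes, then report the mean score over 100 evaluation episodes."""
    for _ in range(TRAIN_EPISODES_PER_EPOCH):
        run_episode(train=True)
    eval_scores = [run_episode(train=False) for _ in range(EVAL_EPISODES_PER_EPOCH)]
    return sum(eval_scores) / EVAL_EPISODES_PER_EPOCH
```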
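
The Experiment Setup row compares a baseline against three observation augmentations: unique agent IDs, 0.75 RNI, and 0.25 RNI (random node initialisation). The sketch below shows one plausible way such augmentations could be applied to per-agent observations. The function name `augment_observations` and the reading of the RNI fraction as the share of random dimensions in the augmented observation are assumptions for illustration, not details taken from the paper's released code.

```python
import torch


def augment_observations(obs: torch.Tensor, mode: str, rni_fraction: float = 0.75) -> torch.Tensor:
    """Illustrative sketch of the observation augmentations compared in the paper.

    obs: (n_agents, obs_dim) tensor of per-agent observations.
    mode: "none" (baseline), "ids" (unique agent IDs), or "rni".
    rni_fraction: assumed here to be the share of the augmented observation
        made up of random dimensions (e.g. 0.75 or 0.25), with 0 < fraction < 1.
    """
    n_agents, obs_dim = obs.shape
    if mode == "none":
        return obs
    if mode == "ids":
        # Append a one-hot unique ID for each agent (symmetry breaking).
        ids = torch.eye(n_agents, dtype=obs.dtype, device=obs.device)
        return torch.cat([obs, ids], dim=-1)
    if mode == "rni":
        # Random node initialisation: append noise resampled every episode,
        # sized so that it makes up `rni_fraction` of the final vector.
        n_random = round(rni_fraction * obs_dim / (1.0 - rni_fraction))
        noise = torch.randn(n_agents, n_random, dtype=obs.dtype, device=obs.device)
        return torch.cat([obs, noise], dim=-1)
    raise ValueError(f"unknown augmentation mode: {mode}")
```

Under this assumed reading, a 16-dimensional observation with `rni_fraction=0.75` would have 48 random dimensions appended, so three quarters of the final vector is noise.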