Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Biases for Emergent Communication in Multi-agent Reinforcement Learning

Authors: Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We consider two environments. The first is a simple one-step environment, where agents must sum MNIST digits by communicating their value. ... The second environment is a new multi-step MARL environment which we name Treasure Hunt." (Section 4, Empirical Analysis) |
| Researcher Affiliation | Industry | All five authors (Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel) are affiliated with DeepMind, London, UK. |
| Pseudocode | Yes | "Algorithm 1: Calculation of positive signalling loss" |
| Open Source Code | No | The paper does not include any explicit statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | "In this task, depicted in Figure 1, the speaker and listener agents each observe a different MNIST digit (as an image), and must determine the sum of the digits." (Section 4.1, Summing MNIST digits) |
| Dataset Splits | No | The paper mentions training agents on a "batch of rollouts" but does not provide explicit train/validation/test splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not describe the hardware used to run its experiments (e.g., GPU/CPU models or memory amounts). |
| Software Dependencies | No | The paper names REINFORCE, Advantage Actor-Critic, V-trace, and RMSProp, but gives no version numbers for software libraries or dependencies (e.g., Python, TensorFlow, or PyTorch versions). |
| Experiment Setup | No | "The full details of the Treasure Hunt environment, together with the hyperparameters used in our agents, can be found in the supplementary material." The setup details are therefore not in the main text. |
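The pseudocode the paper provides (Algorithm 1) computes a positive signalling loss, a bias that rewards the speaker for sending messages that depend on its observation. The sketch below is a minimal mutual-information-style illustration of that idea, not the authors' exact Algorithm 1; the function name, the `target_entropy` and `coef` parameters, and the batch-average entropy proxy are all assumptions made here for clarity.

```python
import numpy as np

def positive_signalling_loss(message_probs, target_entropy=0.5, coef=3.0):
    """Sketch of a positive-signalling bias (assumed form, not the paper's).

    message_probs: array of shape (batch, n_messages), where row i is the
    speaker's message distribution pi(m | s_i) for one state in the batch.

    The loss encourages high entropy of the batch-average message
    distribution (a proxy for H(m)) while pushing each per-state entropy
    (a proxy for H(m | s)) toward a low target, so messages vary across
    states but are near-deterministic given a state.
    """
    eps = 1e-8  # guards log(0) for deterministic rows
    # Entropy of the batch-average message distribution.
    avg = message_probs.mean(axis=0)
    h_avg = -np.sum(avg * np.log(avg + eps))
    # Per-state entropies of the speaker's message distributions.
    h_cond = -np.sum(message_probs * np.log(message_probs + eps), axis=1)
    # Maximise H(m); keep H(m | s) close to the target.
    return -h_avg + coef * np.mean((h_cond - target_entropy) ** 2)
```

Under this sketch, a speaker that sends a distinct message per state incurs a lower loss than one that always sends the same message, which is the qualitative behaviour the positive signalling bias is meant to induce.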