Learning Multiagent Communication with Backpropagation

Authors: Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We explore this model on a range of tasks. In some, supervision is provided for each action while for others it is given sporadically. In the former case, the controller for each agent is trained by backpropagating the error signal through the connectivity structure of the model, enabling the agents to learn how to communicate amongst themselves to maximize the objective. In the latter case, reinforcement learning must be used as an additional outer loop to provide a training signal at each time step (see the supplementary material for details)." (a hedged model sketch follows the table)
Researcher Affiliation | Collaboration | Sainbayar Sukhbaatar, Dept. of Computer Science, Courant Institute, New York University (sainbar@cs.nyu.edu); Arthur Szlam, Facebook AI Research, New York (aszlam@fb.com); Rob Fergus, Facebook AI Research, New York (robfergus@fb.com)
Pseudocode | No | The paper describes its model using mathematical equations and diagrams but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Code for our model (and baselines) can be found at http://cims.nyu.edu/~sainbar/commnet/."
Open Datasets | Yes | Multi-turn Games: "In this section, we consider two multi-agent tasks using the MazeBase environment [26]"
Dataset Splits | Yes | "We used 10% of training data as validation set to find optimal hyper-parameters for the model." (a split sketch follows the table)
Hardware Specification | No | The paper mentions training on "multiple CPU cores" but does not specify any particular hardware models (CPU, GPU) or other hardware specifications.
Software Dependencies | No | The paper mentions optimizers (RMSProp, Adam) and network architectures (LSTM) but does not provide specific software dependencies or library versions (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | "Both tasks are trained for 300 epochs, each epoch being 100 weight updates with RMSProp [31] on mini-batch of 288 game episodes... In all modules, the hidden layer size is set to 50." (a training-loop sketch follows the table)
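
Since the Pseudocode row notes the model is given only as equations, here is a minimal sketch of one CommNet-style communication step, in which each agent's hidden state is updated from its own state plus the mean of the other agents' states. This is a hedged PyTorch reconstruction, not the authors' released code (the link above points to their implementation); the class name CommStep, the tanh nonlinearity, and the bias-free linear maps are assumptions, while the hidden size of 50 comes from the Experiment Setup row.

```python
import torch
import torch.nn as nn

class CommStep(nn.Module):
    """One communication step of a CommNet-style controller (illustrative).

    Each agent's hidden state is updated from its own state (via H) plus
    a communication vector (via C) that averages the other agents' states.
    """
    def __init__(self, hidden=50):
        super().__init__()
        self.H = nn.Linear(hidden, hidden, bias=False)  # self-connection weights
        self.C = nn.Linear(hidden, hidden, bias=False)  # communication weights

    def forward(self, h):
        # h: (num_agents, hidden) hidden states of all agents
        J = h.size(0)
        # communication for agent j: mean of the other agents' states,
        # computed as (sum over all agents - own state) / (J - 1)
        c = (h.sum(dim=0, keepdim=True) - h) / max(J - 1, 1)
        return torch.tanh(self.H(h) + self.C(c))

# usage: two communication steps over 5 agents, each with its own weights
h = torch.randn(5, 50)
step1, step2 = CommStep(), CommStep()
h = step2(step1(h))
```

Stacking several such steps with separate weights mirrors the paper's use of per-step weight matrices, and backpropagating a loss through the stack trains the communication and the policy jointly, as the Research Type row describes.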
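The Dataset Splits row quotes a 90/10 train/validation split used for hyper-parameter search. Since the tasks draw episodes from a game environment rather than a fixed corpus, the slicing below is only one plausible reading, with a hypothetical episodes list standing in for the recorded training data:

```python
import random

episodes = list(range(1000))   # hypothetical pool of recorded game episodes
random.seed(0)
random.shuffle(episodes)
cut = int(0.9 * len(episodes))
train, val = episodes[:cut], episodes[cut:]  # 90% train, 10% validation
```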
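The Experiment Setup row fixes the outer training schedule: 300 epochs of 100 RMSProp updates on mini-batches of 288 game episodes. A self-contained sketch of that schedule follows; the linear model, the synthetic batches, and the learning rate are placeholders, since the excerpt specifies neither the task head nor the learning rate.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(50, 5)  # stand-in for the full multi-agent controller
# learning rate is an assumption; the excerpt above does not report it
opt = torch.optim.RMSprop(model.parameters(), lr=3e-3)

EPOCHS, UPDATES_PER_EPOCH, BATCH = 300, 100, 288  # values quoted from the paper

for epoch in range(EPOCHS):
    for _ in range(UPDATES_PER_EPOCH):
        x = torch.randn(BATCH, 50)              # placeholder for 288 game episodes
        target = torch.randint(0, 5, (BATCH,))  # placeholder supervision
        loss = F.cross_entropy(model(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```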