Learning Multiagent Communication with Backpropagation

Authors: Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We explore this model on a range of tasks. In some, supervision is provided for each action while for others it is given sporadically. In the former case, the controller for each agent is trained by backpropagating the error signal through the connectivity structure of the model, enabling the agents to learn how to communicate amongst themselves to maximize the objective. In the latter case, reinforcement learning must be used as an additional outer loop to provide a training signal at each time step (see the supplementary material for details)." (a hedged model sketch follows the table)
Researcher Affiliation | Collaboration | Sainbayar Sukhbaatar, Dept. of Computer Science, Courant Institute, New York University (sainbar@cs.nyu.edu); Arthur Szlam, Facebook AI Research, New York (aszlam@fb.com); Rob Fergus, Facebook AI Research, New York (robfergus@fb.com)
Pseudocode | No | The paper describes its model using mathematical equations and diagrams but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Code for our model (and baselines) can be found at http://cims.nyu.edu/~sainbar/commnet/."
Open Datasets | Yes | Multi-turn Games: "In this section, we consider two multi-agent tasks using the MazeBase environment [26]"
Dataset Splits | Yes | "We used 10% of training data as validation set to find optimal hyper-parameters for the model." (a split sketch follows the table)
Hardware Specification | No | The paper mentions training on "multiple CPU cores" but does not specify any particular hardware models (CPU, GPU) or other hardware specifications.
Software Dependencies | No | The paper mentions optimizers (RMSProp, Adam) and network architectures (LSTM) but does not provide specific software dependencies or library versions (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | "Both tasks are trained for 300 epochs, each epoch being 100 weight updates with RMSProp [31] on mini-batch of 288 game episodes... In all modules, the hidden layer size is set to 50." (a training-loop sketch follows the table)
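
Since the Pseudocode row notes the model is given only as equations, here is a minimal sketch of one CommNet-style communication step, in which each agent's hidden state is updated from its own state plus the mean of the other agents' states. This is a hedged PyTorch reconstruction, not the authors' released code (the link above points to their implementation); the class name CommStep, the tanh nonlinearity, and the bias-free linear maps are assumptions, while the hidden size of 50 comes from the Experiment Setup row.

```python
import torch
import torch.nn as nn

class CommStep(nn.Module):
    """One communication step of a CommNet-style controller (illustrative).

    Each agent's hidden state is updated from its own state (via H) plus
    a communication vector (via C) that averages the other agents' states.
    """
    def __init__(self, hidden=50):
        super().__init__()
        self.H = nn.Linear(hidden, hidden, bias=False)  # self-connection weights
        self.C = nn.Linear(hidden, hidden, bias=False)  # communication weights

    def forward(self, h):
        # h: (num_agents, hidden) hidden states of all agents
        J = h.size(0)
        # communication for agent j: mean of the other agents' states,
        # computed as (sum over all agents - own state) / (J - 1)
        c = (h.sum(dim=0, keepdim=True) - h) / max(J - 1, 1)
        return torch.tanh(self.H(h) + self.C(c))

# usage: two communication steps over 5 agents, each with its own weights
h = torch.randn(5, 50)
step1, step2 = CommStep(), CommStep()
h = step2(step1(h))
```

Stacking several such steps with separate weights mirrors the paper's use of per-step weight matrices, and backpropagating a loss through the stack trains the communication and the policy jointly, as the Research Type row describes.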
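The Dataset Splits row quotes a 90/10 train/validation split used for hyper-parameter search. Since the tasks draw episodes from a game environment rather than a fixed corpus, the slicing below is only one plausible reading, with a hypothetical episodes list standing in for the recorded training data:

```python
import random

episodes = list(range(1000))   # hypothetical pool of recorded game episodes
random.seed(0)
random.shuffle(episodes)
cut = int(0.9 * len(episodes))
train, val = episodes[:cut], episodes[cut:]  # 90% train, 10% validation
```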
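The Experiment Setup row fixes the outer training schedule: 300 epochs of 100 RMSProp updates on mini-batches of 288 game episodes. A self-contained sketch of that schedule follows; the linear model, the synthetic batches, and the learning rate are placeholders, since the excerpt specifies neither the task head nor the learning rate.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(50, 5)  # stand-in for the full multi-agent controller
# learning rate is an assumption; the excerpt above does not report it
opt = torch.optim.RMSprop(model.parameters(), lr=3e-3)

EPOCHS, UPDATES_PER_EPOCH, BATCH = 300, 100, 288  # values quoted from the paper

for epoch in range(EPOCHS):
    for _ in range(UPDATES_PER_EPOCH):
        x = torch.randn(BATCH, 50)              # placeholder for 288 game episodes
        target = torch.randint(0, 5, (BATCH,))  # placeholder supervision
        loss = F.cross_entropy(model(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```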