Dynamic population-based meta-learning for multi-agent communication with natural language

Authors: Abhinav Gupta, Marc Lanctot, Angeliki Lazaridou

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present extensive experiments in two referential game environments (an image-based and a text-based, defined in Section 4) using natural language communication and compare our model against previous approaches, strong baselines and ablations of our method.
Researcher Affiliation | Collaboration | Abhinav Gupta (Mila, abhinavg@nyu.edu), Marc Lanctot (DeepMind, lanctot@deepmind.com), Angeliki Lazaridou (DeepMind, angeliki@deepmind.com)
Pseudocode | Yes | Algorithm 1 (excerpt): Input: dataset D, collection of objects O, randomly initialized speaker parameters θ0, listener parameters φ0, meta-speaker parameters θ̂0, meta-listener parameters φ̂0, and empty buffers B0^S, B0^L. θ1 ← SupervisedLearning({m, t} ∈ D; θ0) (Eq. 3); φ1 ← SupervisedLearning({m, t, D} ∈ D; φ0) (Eq. 4); i ← 1; repeat ... (loop body truncated in the extraction; a Python sketch of this fragment follows the table)
Open Source Code | No | No explicit statement or link providing access to the source code for the described methodology was found.
Open Datasets | Yes | We use the MSCOCO dataset [30] to obtain real images and the corresponding ground truth English captions annotated by humans.
Dataset Splits | Yes | We use 5000 images as training set and 1000 images as validation/test set.
Hardware Specification | No | The paper thanks Compute Canada for providing the compute resources to run the experiments, but does not specify hardware details such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions using PyTorch [40] but does not provide a specific version number, nor does it list other software dependencies with version numbers.
Experiment Setup | Yes | Both speaker and listener are parameterized with recurrent neural networks (GRU [4]) of size 512 with an embedding layer of size 256. We embed images using a ResNet-50 model [16] (pretrained on ImageNet [7]). We set the vocabulary size to 100 and the maximum length of the sentences to 15. We use the same speaker and listener buffer size of 200 for reservoir sampling. Other implementation details are given in the Appendix. (Hedged sketches of the agent modules and the reservoir buffer follow this table.)
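
The Algorithm 1 fragment quoted in the Pseudocode row covers only the supervised pretraining steps before stopping at "repeat". The Python sketch below renders just those quoted steps; supervised_learning is a hypothetical stub, and the loop body stays elided because the fragment truncates there.

    # Sketch of the quoted Algorithm 1 fragment. Only the pretraining steps
    # (Eqs. 3 and 4) and the loop opening appear in the excerpt; all names
    # here are placeholders, not the authors' released code.
    def supervised_learning(dataset, params):
        """Stub for the supervised pretraining of Eq. (3) / Eq. (4)."""
        raise NotImplementedError

    def algorithm_1(dataset, objects, theta_0, phi_0, meta_theta_0, meta_phi_0):
        speaker_buffer, listener_buffer = [], []   # empty buffers B0^S, B0^L

        # Eq. (3): pretrain the speaker on {m, t} pairs drawn from D.
        theta_1 = supervised_learning(dataset, theta_0)

        # Eq. (4): pretrain the listener on {m, t, D} tuples drawn from D.
        phi_1 = supervised_learning(dataset, phi_0)

        i = 1
        # repeat ...  (loop body truncated in the quoted fragment)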
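The Experiment Setup row pins down enough detail for a rough architecture sketch: GRU agents of hidden size 512 with 256-dimensional embeddings, ResNet-50 image features, a vocabulary of 100 symbols, and messages of at most 15 tokens. The PyTorch classes below are an illustration under those hyperparameters, not the authors' implementation; the class names, the frozen visual encoder, the BOS convention, and the precomputed candidate features are assumptions.

    # Hedged PyTorch sketch of the agents described above. Hyperparameters
    # (hidden 512, embedding 256, vocab 100, max length 15) come from the
    # quoted setup; everything else is an assumption.
    import torch
    import torch.nn as nn
    import torchvision.models as models

    VOCAB, EMBED, HIDDEN, MAX_LEN = 100, 256, 512, 15

    class Speaker(nn.Module):
        """Encodes an image with ResNet-50 features and emits a message."""
        def __init__(self):
            super().__init__()
            resnet = models.resnet50(pretrained=True)          # ImageNet weights
            self.vision = nn.Sequential(*list(resnet.children())[:-1])
            self.img_proj = nn.Linear(2048, HIDDEN)
            self.embed = nn.Embedding(VOCAB, EMBED)
            self.gru = nn.GRU(EMBED, HIDDEN, batch_first=True)
            self.out = nn.Linear(HIDDEN, VOCAB)

        def forward(self, image):
            with torch.no_grad():                              # frozen encoder (assumption)
                feats = self.vision(image).flatten(1)          # (B, 2048)
            h = torch.tanh(self.img_proj(feats)).unsqueeze(0)  # initial GRU state
            token = torch.zeros(image.size(0), dtype=torch.long)  # BOS id 0 (assumption)
            message = []
            for _ in range(MAX_LEN):                           # at most 15 symbols
                _, h = self.gru(self.embed(token).unsqueeze(1), h)
                logits = self.out(h.squeeze(0))                # (B, VOCAB)
                token = torch.distributions.Categorical(logits=logits).sample()
                message.append(token)
            return torch.stack(message, dim=1)                 # (B, MAX_LEN)

    class Listener(nn.Module):
        """Encodes the message with a GRU and scores candidate images."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, EMBED)
            self.gru = nn.GRU(EMBED, HIDDEN, batch_first=True)
            self.img_proj = nn.Linear(2048, HIDDEN)

        def forward(self, message, candidate_feats):
            # candidate_feats: precomputed ResNet-50 features, (B, n_cand, 2048)
            _, h = self.gru(self.embed(message))               # h: (1, B, HIDDEN)
            scores = torch.bmm(self.img_proj(candidate_feats),
                               h.squeeze(0).unsqueeze(2))      # (B, n_cand, 1)
            return scores.squeeze(2)                           # higher = better match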
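The same row reports speaker and listener buffers of size 200 maintained by reservoir sampling. The paper does not spell out its exact variant; a minimal sketch using the standard Algorithm R, with a hypothetical ReservoirBuffer class, is:

    import random

    class ReservoirBuffer:
        """Fixed-capacity buffer (200 in the paper) kept with reservoir
        sampling: every item ever added has equal probability of staying."""
        def __init__(self, capacity=200):
            self.capacity = capacity
            self.items = []
            self.n_seen = 0

        def add(self, item):
            self.n_seen += 1
            if len(self.items) < self.capacity:
                self.items.append(item)            # fill phase
            else:
                j = random.randrange(self.n_seen)  # uniform over all items seen
                if j < self.capacity:
                    self.items[j] = item           # replace a random slot

        def sample(self, k):
            return random.sample(self.items, min(k, len(self.items)))

Each agent checkpoint added to the population would pass through add(), so the buffer remains a uniform subsample of the whole training history even though only 200 entries are stored.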