Emergent Communication at Scale

Authors: Rahma Chaabouni, Florian Strub, Florent Altché, Eugene Tarassov, Corentin Tallec, Elnaz Davoodi, Kory Wallace Mathewson, Olivier Tieleman, Angeliki Lazaridou, Bilal Piot

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Overall, our experiments provide a large spectrum of observations, both positive and negative.
Researcher Affiliation | Industry | Contributed equally. Corresponding authors: {rahmac,fstrub,piot}@deepmind.com
Pseudocode | Yes | Listing 2: Face reconstruction Head (an illustrative sketch follows the table)
Open Source Code | Yes | Source code: github.com/deepmind/emergent_communication_at_scale
Open Datasets | Yes | We use the ImageNet (Deng et al., 2009; Russakovsky et al., 2015) and CelebA (Liu et al., 2015) datasets, which respectively contain 1400k and 200k labelled images.
Dataset Splits | Yes | In our experiments, we use 99% of the official train set for training, i.e., 1300k images, the last 1% of the train set for validation, i.e., 13k images, and the official validation set as our test set (i.e., 50k images). (A loading/split sketch follows the table.)
Hardware Specification | Yes | Table 5: Computational requirements for our base setup. GPU memory refers to the peak GPU memory usage. Devices: P100, V100
Software Dependencies | No | The paper mentions the 'Jaxline pipeline (Babuschkin et al., 2020)' and 'Adam optimisers (Kingma & Ba, 2015)' as software used. However, it does not specify version numbers for these or any other software components or libraries.
Experiment Setup | Yes | Table 3: Hyper-parameter values across datasets and settings: learning rate lr = 0.0001; batch training size |X| = 1024; number of candidates |C| = 1024; number of agents sampled P = min(N, 10); KL coefficient β = 0.5; KL EMA η = 0.99; entropy coefficient α = 0.0001; vocabulary size |W| = 20; message length T = 10; imitation EMA µ = 0.99. (A config sketch after the table collects these values.)
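
The dataset split in the Dataset Splits row maps directly onto percentage subsplits. The snippet below is a minimal sketch, assuming the TensorFlow Datasets builders imagenet2012 and celeb_a; the paper does not state which data-loading library it uses, so the builder names and the pipeline itself are assumptions.

import tensorflow_datasets as tfds

# ImageNet: 99% of the official train set for training (~1300k images),
# the last 1% for validation (~13k images), and the official validation
# set (50k images) as the test set. The imagenet2012 builder requires a
# manual download of the ImageNet archives.
train_ds = tfds.load("imagenet2012", split="train[:99%]")
valid_ds = tfds.load("imagenet2012", split="train[99%:]")
test_ds = tfds.load("imagenet2012", split="validation")

# CelebA (~200k labelled images), used for the face-reconstruction setting.
celeba_train = tfds.load("celeb_a", split="train")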
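
For readability, the Table 3 values from the Experiment Setup row can be gathered into a single configuration object. This is a sketch with assumed key names; the released Jaxline configs in the repository may organise these values differently.

# Hyper-parameters from Table 3; the key names are illustrative.
BASE_CONFIG = {
    "learning_rate": 0.0001,       # lr
    "batch_size": 1024,            # |X|, training batch size
    "num_candidates": 1024,        # |C|, candidates in the discrimination game
    "num_agents_sampled": 10,      # P = min(N, 10), capped by the population size N
    "kl_coefficient": 0.5,         # beta
    "kl_ema": 0.99,                # eta
    "entropy_coefficient": 0.0001, # alpha
    "vocab_size": 20,              # |W|
    "message_length": 10,          # T
    "imitation_ema": 0.99,         # mu
}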
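
The Pseudocode row refers to Listing 2 ("Face reconstruction Head"). The sketch below is not that listing; it is only a hypothetical illustration of a reconstruction head as a small Haiku/JAX deconvolutional decoder, with the layer sizes, latent width, and output resolution chosen arbitrarily.

import haiku as hk
import jax
import jax.numpy as jnp


class FaceReconstructionHead(hk.Module):
    """Hypothetical decoder from a latent embedding to an RGB image."""

    def __call__(self, z: jnp.ndarray) -> jnp.ndarray:
        x = jax.nn.relu(hk.Linear(4 * 4 * 256)(z))
        x = jnp.reshape(x, (-1, 4, 4, 256))
        # Three stride-2 transposed convolutions: 4x4 -> 8x8 -> 16x16 -> 32x32.
        for channels in (128, 64, 32):
            x = jax.nn.relu(hk.Conv2DTranspose(channels, kernel_shape=4, stride=2)(x))
        # Map to 3 RGB channels in [0, 1].
        return jax.nn.sigmoid(hk.Conv2DTranspose(3, kernel_shape=4, stride=1)(x))


def _forward(z):
    return FaceReconstructionHead()(z)


reconstruct = hk.without_apply_rng(hk.transform(_forward))
params = reconstruct.init(jax.random.PRNGKey(0), jnp.zeros((1, 128)))
images = reconstruct.apply(params, jnp.zeros((1, 128)))  # shape (1, 32, 32, 3)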