Emergent Communication at Scale
Authors: Rahma Chaabouni, Florian Strub, Florent Altché, Eugene Tarassov, Corentin Tallec, Elnaz Davoodi, Kory Wallace Mathewson, Olivier Tieleman, Angeliki Lazaridou, Bilal Piot
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Overall, our experiments provide a large spectrum of observations, both positive and negative. |
| Researcher Affiliation | Industry | Contributed equally. Corresponding authors: {rahmac,fstrub,piot}@deepmind.com |
| Pseudocode | Yes | Listing 2: Face reconstruction Head |
| Open Source Code | Yes | Source code: github.com/deepmind/emergent_communication_at_scale |
| Open Datasets | Yes | We use the ImageNet (Deng et al., 2009; Russakovsky et al., 2015) and CelebA (Liu et al., 2015) datasets, which respectively contain 1400k and 200k labelled images. |
| Dataset Splits | Yes | In our experiments, we use 99% of the official train set for training, i.e., 1300k images, the last 1% of the train set for validation, i.e., 13k images, and the official validation set as our test set (i.e., 50k images). |
| Hardware Specification | Yes | Table 5: Computational requirements for our base setup. GPU memory refers to the peak GPU memory usage. Device: P100, V100 |
| Software Dependencies | No | The paper mentions 'Jaxline pipeline (Babuschkin et al., 2020)' and 'Adam optimisers (Kingma & Ba, 2015)' as software used. However, it does not specify version numbers for these or any other software components or libraries. |
| Experiment Setup | Yes | Table 3: Hyper-parameter values across datasets and settings: learning rate lr = 0.0001; batch training size \|X\| = 1024; number of candidates \|C\| = 1024; number of agents sampled P = min(N, 10); KL coefficient β = 0.5; KL EMA η = 0.99; entropy coefficient α = 0.0001; vocabulary size \|W\| = 20; message length T = 10; imitation EMA µ = 0.99 |
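
The Dataset Splits row above specifies the ImageNet partitioning precisely (99% of the official train set for training, the remaining 1% for validation, and the official validation set as the test set), so it can be reproduced directly. Below is a minimal sketch assuming the data are loaded through TensorFlow Datasets; the paper does not name its data-loading library, so the `tfds` calls are an assumption, while the split percentages follow the quoted description.

```python
# Hedged sketch of the reported ImageNet split (assumes TensorFlow Datasets;
# the paper does not state which loading library it uses).
import tensorflow_datasets as tfds


def load_imagenet_splits():
    # 99% of the official train set for training (~1300k images per the paper).
    train = tfds.load("imagenet2012", split="train[:99%]")
    # Remaining 1% of the official train set for validation (~13k images).
    val = tfds.load("imagenet2012", split="train[99%:]")
    # Official validation set (50k images) repurposed as the test set.
    test = tfds.load("imagenet2012", split="validation")
    return train, val, test
```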
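
The Experiment Setup row lists the Table 3 hyper-parameters in flattened form. The sketch below gathers them into an `ml_collections.ConfigDict`, in the style commonly used with Jaxline experiments; the field names and the `get_config` helper are illustrative and are not taken from the released repository, whose actual config schema may differ.

```python
# Hedged sketch: Table 3 hyper-parameters collected into a flat config.
# Field names are illustrative, not the repository's actual schema.
import ml_collections


def get_config(num_agents: int) -> ml_collections.ConfigDict:
    config = ml_collections.ConfigDict()
    config.learning_rate = 1e-4                       # lr
    config.batch_size = 1024                          # |X|
    config.num_candidates = 1024                      # |C|
    config.num_agents_sampled = min(num_agents, 10)   # P = min(N, 10)
    config.kl_coefficient = 0.5                       # beta
    config.kl_ema = 0.99                              # eta
    config.entropy_coefficient = 1e-4                 # alpha
    config.vocab_size = 20                            # |W|
    config.message_length = 10                        # T
    config.imitation_ema = 0.99                       # mu
    return config
```

A caller would build the config once per population size, e.g. `config = get_config(num_agents=N)`, and pass it to whatever experiment builder is in use; this wiring is assumed, not documented in the quoted material.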