Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication
Authors: Valentin Taillandier, Dieuwke Hupkes, Benoît Sagot, Emmanuel Dupoux, Paul Michel
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we find that simple neural-network-based agents trained with reinforcement learning do not consistently develop natural turn-taking strategies. However, agents that do develop a turn-taking protocol are able to achieve a much higher score, sometimes solving the task perfectly. |
| Researcher Affiliation | Collaboration | Valentin Taillandier (ENS PSL, Inria, valentin.taillandier@gmail.com); Dieuwke Hupkes (Meta AI Research, dieuwkehupkes@fb.com); Benoît Sagot (Inria, benoit.sagot@inria.fr); Emmanuel Dupoux (Meta AI Research, dpx@fb.com); Paul Michel (ENS PSL, Inria, pmichel31415@gmail.com) |
| Pseudocode | No | The paper describes the model architecture and training process in text and with a diagram (Figure 2), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code used for the analysis in this paper is publicly available on a GitHub repository: https://github.com/vHitsuji/turntaking |
| Open Datasets | No | To facilitate analysis and enable fine-grained control of the difficulty of the game, in the remainder of this paper we consider games where the underlying objects x are simple attribute-value vectors (Kottur et al., 2017; Chaabouni et al., 2020). Specifically, each object x is a vector of N_a attributes, each of which can take N_v distinct values, for a total of N_v^{N_a} possible combinations. We represent a partial view x̂ on an object x by masking N_m of its attributes. The paper describes how data is *generated* for the game, not a pre-existing public dataset with access information. |
| Dataset Splits | Yes | Unless specified otherwise, agents are optimized on a training set of 10^6 examples. We also sample a validation and test set of 1,000 examples each. |
| Hardware Specification | Yes | All experiments are run on single Nvidia V100 GPUs (16 or 32GB VRAM). |
| Software Dependencies | No | Our implementation is written in Pytorch (Paszke et al., 2019), and is based on the EGG framework (Kharitonov et al., 2021). No version numbers provided for PyTorch or EGG. |
| Experiment Setup | Yes | Agents are trained using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001, a batch size of 2048, and a weight of 0.001 for the entropy term. We train for a total of 600,000 steps, and keep the pair of agents with the highest accuracy on the validation set. |
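The data-generation process quoted in the Open Datasets row (objects as vectors of N_a attributes over N_v values, with partial views masking N_m attributes) can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the names `sample_object`, `mask_view`, and the `MASK` sentinel are assumptions.

```python
import random

MASK = -1  # illustrative sentinel marking a hidden attribute

def sample_object(n_attributes, n_values):
    """Draw a random object x uniformly from the N_v**N_a combinations."""
    return [random.randrange(n_values) for _ in range(n_attributes)]

def mask_view(x, n_masked):
    """Return a partial view of x with n_masked attributes hidden."""
    hidden = set(random.sample(range(len(x)), n_masked))
    return [MASK if i in hidden else v for i, v in enumerate(x)]

# Example: N_a = 4 attributes, N_v = 8 values, N_m = 2 masked
x = sample_object(4, 8)
view = mask_view(x, 2)
```

Sampling the train/validation/test splits described in the Dataset Splits row then amounts to repeated calls to `sample_object`, which is consistent with the review's conclusion that the data is generated rather than drawn from a pre-existing public dataset.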
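The hyperparameters in the Experiment Setup row map directly onto a standard PyTorch optimizer configuration. The sketch below only fixes those reported values; the model is a stand-in with illustrative sizes, since the paper's agents are reinforcement-learning-trained communication agents built on the EGG framework.

```python
import torch

# Values reported in the Experiment Setup row; everything else is a stand-in.
LEARNING_RATE = 1e-3
BATCH_SIZE = 2048
ENTROPY_WEIGHT = 1e-3   # weight of the entropy term in the loss
TOTAL_STEPS = 600_000

model = torch.nn.Linear(32, 16)  # placeholder for the paper's agents
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
```

Model selection, per the row above, keeps the pair of agents with the highest validation accuracy over the 600,000 training steps.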