Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication

Authors: Valentin Taillandier, Dieuwke Hupkes, Benoît Sagot, Emmanuel Dupoux, Paul Michel

ICLR 2023

Reproducibility assessment. Each entry below gives the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
"In experiments, we find that simple neural-network-based agents trained with reinforcement learning do not consistently develop natural turn-taking strategies. However, agents that do develop a turn-taking protocol are able to achieve a much higher score, sometimes solving the task perfectly."
Researcher Affiliation: Collaboration
Valentin Taillandier (ENS PSL, Inria) valentin.taillandier@gmail.com; Dieuwke Hupkes (Meta AI Research) dieuwkehupkes@fb.com; Benoît Sagot (Inria) benoit.sagot@inria.fr; Emmanuel Dupoux (Meta AI Research) dpx@fb.com; Paul Michel (ENS PSL, Inria) pmichel31415@gmail.com
Pseudocode: No
"The paper describes the model architecture and training process in text and with a diagram (Figure 2), but does not include any explicit pseudocode or algorithm blocks."
Open Source Code: Yes
"The code used for the analysis in this paper is publicly available on a GitHub repository (https://github.com/vHitsuji/turntaking)."
Open Datasets: No
"To facilitate analysis and enable fine-grained control of the difficulty of the game, in the remaining of this paper we consider games where the underlying objects x are simple attribute-value vectors (Kottur et al., 2017; Chaabouni et al., 2020). Specifically, each object x is a vector of Na attributes, each of which can take Nv distinct values, for a total of Nv^Na possible combinations. We represent a partial view x̂ on an object x by masking Nm of its attributes."
The paper describes how the data is generated for the game; it does not use a pre-existing public dataset with access information.
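The generation scheme quoted above is simple enough to sketch directly. The following is a minimal illustration (function names and the use of `None` as the mask token are our own choices, not taken from the paper's codebase):

```python
import random

def sample_object(n_attributes, n_values, rng=random):
    """Sample an attribute-value object: one of n_values per attribute."""
    return [rng.randrange(n_values) for _ in range(n_attributes)]

def partial_view(obj, n_masked, mask_token=None, rng=random):
    """Hide n_masked randomly chosen attributes of obj behind mask_token."""
    masked = set(rng.sample(range(len(obj)), n_masked))
    return [mask_token if i in masked else v for i, v in enumerate(obj)]

# An object with Na = 4 attributes, Nv = 10 values each (10^4 combinations),
# and a partial view with Nm = 2 attributes masked.
x = sample_object(n_attributes=4, n_values=10)
x_hat = partial_view(x, n_masked=2)
```

With Na attributes and Nv values per attribute, there are Nv^Na distinct objects, which is what lets the authors dial the difficulty of the game up or down.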
Dataset Splits: Yes
"Unless specified otherwise, agents are optimized on a training set of 10^6 examples. We also sample a validation and test set of 1,000 examples each."
Hardware Specification: Yes
"All experiments are run on single Nvidia V100 GPUs (16 or 32GB VRAM)."
Software Dependencies: No
"Our implementation is written in Pytorch (Paszke et al., 2019), and is based on the EGG framework (Kharitonov et al., 2021)."
No version numbers are provided for PyTorch or EGG.
Experiment Setup: Yes
"Agents are trained using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001, a batch size of 2048 and a weight of 0.001 for the entropy term. We train for a total of 600,000 steps, and keep the pair of agents with the highest accuracy on the validation set."
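The reported hyperparameters translate into a conventional PyTorch training loop. The sketch below shows only the optimizer configuration and the entropy-regularized loss; the `Linear` model and cross-entropy loss are placeholders, not the paper's actual sender/receiver agents:

```python
import torch

# Hyperparameters as reported in the paper; everything else is illustrative.
LR, BATCH_SIZE, ENTROPY_WEIGHT, TOTAL_STEPS = 1e-3, 2048, 1e-3, 600_000

model = torch.nn.Linear(8, 4)  # stand-in for the agents
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

def training_step(inputs, targets):
    logits = model(inputs)
    task_loss = torch.nn.functional.cross_entropy(logits, targets)
    # Entropy bonus (weight 0.001): subtracting weighted entropy from the
    # loss encourages exploration in REINFORCE-style training.
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1).mean()
    loss = task_loss - ENTROPY_WEIGHT * entropy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the actual setup, such a step would run 600,000 times on batches of 2048 generated examples, with the best agent pair selected by validation accuracy.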