Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication
Authors: Valentin Taillandier, Dieuwke Hupkes, Benoît Sagot, Emmanuel Dupoux, Paul Michel
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we find that simple neural-network-based agents trained with reinforcement learning do not consistently develop natural turn-taking strategies. However, agents that do develop a turn-taking protocol are able to achieve a much higher score, sometimes solving the task perfectly. |
| Researcher Affiliation | Collaboration | Valentin Taillandier ENS PSL, Inria EMAIL Dieuwke Hupkes Meta AI Research EMAIL Benoît Sagot Inria EMAIL Emmanuel Dupoux Meta AI Research EMAIL Paul Michel ENS PSL, Inria EMAIL |
| Pseudocode | No | The paper describes the model architecture and training process in text and with a diagram (Figure 2), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code used for the analysis in this paper is publicly available on a GitHub repository (footnote 3: https://github.com/vHitsuji/turntaking). |
| Open Datasets | No | To facilitate analysis and enable fine-grained control of the difficulty of the game, in the remaining of this paper we consider games where the underlying objects $x$ are simple attribute-value vectors (Kottur et al., 2017; Chaabouni et al., 2020). Specifically, each object $x$ is a vector of $N_a$ attributes, each of which can take $N_v$ distinct values, for a total of $N_v^{N_a}$ possible combinations. We represent a partial view $\hat{x}$ on an object $x$ by masking $N_m$ of its attributes. The paper describes how data is *generated* for the game, not a pre-existing public dataset with access info. |
| Dataset Splits | Yes | Unless specified otherwise, agents are optimized on a training set of $10^6$ examples. We also sample a validation and test set of 1,000 examples each. |
| Hardware Specification | Yes | All experiments are run on single Nvidia V100 GPUs (16 or 32GB VRAM). |
| Software Dependencies | No | Our implementation is written in Pytorch (Paszke et al., 2019), and is based on the EGG framework (Kharitonov et al., 2021). No version numbers provided for PyTorch or EGG. |
| Experiment Setup | Yes | Agents are trained using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001, a batch size of 2048 and a weight of 0.001 for the entropy term. We train for a total of 600,000 steps, and keep the pair of agents with the highest accuracy on the validation set. |
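
The Open Datasets and Dataset Splits rows describe data that is sampled rather than downloaded. The following is a minimal sketch of what such a generator might look like, not the authors' implementation. The function names, the `MASK` sentinel, the random seed, and the values chosen for the number of attributes, values, and masked attributes are all hypothetical; the quoted text does not state them. The training split is also reduced from the reported 10^6 examples so the sketch runs quickly, and each object gets a single partial view for simplicity.

```python
import random

MASK = -1  # assumed sentinel for a hidden attribute; the paper's encoding may differ


def sample_object(n_attributes, n_values, rng):
    """Draw one object: a vector of n_attributes values, each in [0, n_values)."""
    return tuple(rng.randrange(n_values) for _ in range(n_attributes))


def partial_view(obj, n_masked, rng):
    """Return a copy of obj with n_masked randomly chosen attributes hidden."""
    hidden = set(rng.sample(range(len(obj)), n_masked))
    return tuple(MASK if i in hidden else v for i, v in enumerate(obj))


def make_split(size, n_attributes, n_values, n_masked, rng):
    """Sample `size` (object, partial view) pairs independently."""
    examples = []
    for _ in range(size):
        obj = sample_object(n_attributes, n_values, rng)
        examples.append((obj, partial_view(obj, n_masked, rng)))
    return examples


rng = random.Random(0)  # arbitrary seed, chosen only for repeatability

# Hypothetical game parameters (N_a, N_v, N_m in the table's notation).
n_attributes, n_values, n_masked = 6, 4, 2

# Size of the object space: N_v ** N_a possible attribute combinations.
total_objects = n_values ** n_attributes

# The report quotes 10^6 training examples; 10,000 are used here for speed.
train = make_split(10_000, n_attributes, n_values, n_masked, rng)
val = make_split(1_000, n_attributes, n_values, n_masked, rng)
test = make_split(1_000, n_attributes, n_values, n_masked, rng)
```

Splits are sampled independently here, so the same object can appear in more than one split. Whether the original code deduplicates across splits is not stated in the quoted text.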