Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Expressivity of Emergent Languages is a Trade-off between Contextual Complexity and Unpredictability
Authors: Shangmin Guo, Yi Ren, Kory Wallace Mathewson, Simon Kirby, Stefano V Albrecht, Kenny Smith
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We measure the expressivity of emergent languages based on the generalisation performance across different games, and demonstrate that the expressivity of emergent languages is a trade-off between the complexity and unpredictability of the context those languages emerged from. |
| Researcher Affiliation | Collaboration | Shangmin Guo (University of Edinburgh), Yi Ren (University of British Columbia), Kory Mathewson (DeepMind), Simon Kirby (University of Edinburgh), Stefano V. Albrecht (University of Edinburgh), Kenny Smith (University of Edinburgh) |
| Pseudocode | Yes | Procedure 1: Procedure for the language emergence and transfer experiment. Input: a set of source games $G_s$, a set of target games $G_t$. For every game $g_s^i$ in $G_s$ do: (1) initialise a new speaker and listener for $g_s^i$, and train them to play $g_s^i$ with the whole $X$; (2) after the agents converge on $g_s^i$, record $L = \{(x, m) \mid x \in X\}$; (3) randomly shuffle and split $L$ into 2 disjoint sets $L_{train}$ and $L_{test}$ s.t. $\lvert L_{train} \rvert = 90\% \lvert L \rvert$; (4) for every game $g_t^j$ in $G_t$ do: (a) initialise a new listener for $g_t^j$; (b) train the listener with $L_{train}$ to complete $g_t^j$; (c) record the accuracy of the listener on $L_{test}$ as the generalisation performance of $g_s^i$ on $g_t^j$; end; end. |
| Open Source Code | Yes | Codes are released at https://github.com/uoe-agents/Expressivity-of-Emergent-Languages. |
| Open Datasets | No | The paper describes the composition of its synthetic data ('input space X from which the speaker's observations are drawn, which consists of 10,000 possible inputs'), and the code for generating it is likely available with the open-source code, but it does not provide a direct link, DOI, or citation for a pre-existing, named public dataset explicitly used or made available for download. |
| Dataset Splits | No | Procedure 1, step 3 states: 'randomly shuffle and split $L$ into 2 disjoint sets $L_{train}$ and $L_{test}$ s.t. $\lvert L_{train} \rvert = 90\% \lvert L \rvert$'. This describes training and test splits but does not explicitly mention a separate 'validation' set. |
| Hardware Specification | Yes | The experiments across 6 random seeds, 18 source games, 11 target games took 4,216 hours in total, on Nvidia Tesla P100. |
| Software Dependencies | No | The paper mentions using the EGG framework, Adam algorithm, and Gumbel-Softmax trick, but it does not provide specific version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | As for updating the parameters, we use the Adam algorithm introduced by Kingma & Ba (2015), and the learning rate is set to $10^{-4}$. To allow the gradients being propagated through the discrete channel to overcome the sampling issue of messages, we apply the Gumbel-Softmax trick proposed by Jang et al. (2020), and the temperature hyper-parameter τ is set to 1.0. |
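
The control flow of Procedure 1 quoted in the Pseudocode row can be sketched in plain Python. This is a minimal illustration, not the authors' released code: the names `split_language`, `transfer_experiment`, `emerge_language`, and `train_and_eval_listener` are hypothetical, the two callables stand in for the speaker/listener training that the paper implements with EGG, and the 90% split is taken as an exact fraction of the recorded language.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[object, object]  # one (input x, message m) entry of the language L


def split_language(language: Sequence[Pair], train_frac: float = 0.9,
                   seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Step 3: shuffle L and split it into disjoint L_train and L_test."""
    pairs = list(language)
    random.Random(seed).shuffle(pairs)  # seeded so the split is repeatable
    cut = round(train_frac * len(pairs))
    return pairs[:cut], pairs[cut:]


def transfer_experiment(
    source_games: Sequence[str],
    target_games: Sequence[str],
    emerge_language: Callable[[str], Sequence[Pair]],
    train_and_eval_listener: Callable[[str, List[Pair], List[Pair]], float],
    seed: int = 0,
) -> Dict[Tuple[str, str], float]:
    """Outer loop over source games, inner loop over target games."""
    results: Dict[Tuple[str, str], float] = {}
    for g_s in source_games:
        # Steps 1-2 (hypothetical callable): train fresh agents on g_s
        # and record the converged language L = {(x, m) | x in X}.
        language = emerge_language(g_s)
        l_train, l_test = split_language(language, seed=seed)
        for g_t in target_games:
            # Step 4 (hypothetical callable): train a fresh listener on
            # L_train for g_t and return its accuracy on L_test.
            results[(g_s, g_t)] = train_and_eval_listener(g_t, l_train, l_test)
    return results
```

The result is a mapping from (source game, target game) pairs to held-out listener accuracy, which is the quantity the procedure records as generalisation performance. Note that only a train/test split appears here, matching the Dataset Splits row: no validation set is carved out.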