Learning Translations: Emergent Communication Pretraining for Cooperative Language Acquisition

Authors: Dylan Cope, Peter McBurney

Venue: IJCAI 2024

Reproducibility assessment: each entry below gives the variable, the assessed result, and the LLM response supporting that assessment.

Research Type: Experimental
LLM Response: 'We propose and compare two methods for solving CLAPs: Behaviour Cloning (BC), and Emergent Communication pretraining and Translation Learning (ECTL)... We introduce two environments and train target communities of agents that cooperate via a learned communication protocol. Data is then gathered from these communities and then used to train joiner agents with ECTL and the behaviour cloning (BC) algorithm [Sammut, 2010]. We demonstrate that ECTL is more robust than BC to expert demonstrations that give an incomplete picture of the underlying problem and that ECTL is significantly more effective than BC when communications data is limited. Finally, we apply these methods to user-generated data and show that ECTL can learn to communicate with a human to cooperatively solve a task.'

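As context for the BC baseline named in this entry, here is a minimal behaviour-cloning sketch in PyTorch. It is illustrative only: the network sizes, the discrete-action assumption, and the stand-in data are hypothetical, not taken from the paper or its repository.

```python
import torch
import torch.nn as nn

# Hypothetical joiner-agent policy: maps an observation vector to
# logits over a discrete action space (all sizes are placeholders).
policy = nn.Sequential(
    nn.Linear(32, 64),  # 32-dim observations (assumed)
    nn.ReLU(),
    nn.Linear(64, 5),   # 5 discrete actions (assumed)
)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def behaviour_cloning_step(obs_batch, expert_actions):
    """One supervised update: imitate expert actions from demonstrations."""
    logits = policy(obs_batch)
    loss = loss_fn(logits, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Random stand-in data; in practice the batch would come from the
# interaction datasets the paper describes collecting.
obs = torch.randn(16, 32)
acts = torch.randint(0, 5, (16,))
print(behaviour_cloning_step(obs, acts))
```
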
Researcher Affiliation: Academia
LLM Response: 'Dylan Cope and Peter McBurney, King's College London, dylan.cope@kcl.ac.uk'

Pseudocode: No
LLM Response: No structured pseudocode or algorithm blocks were found. The paper contains architecture diagrams (Figures 1a and 2) but no formal pseudocode.

Open Source Code: Yes
LLM Response: 'Further training details and hyperparameters can be found in the project's GitHub repository: https://github.com/DylanCope/learning-translations'

Open Datasets: No
LLM Response: The paper describes creating custom datasets: 'Data is then gathered from these communities and then used to train joiner agents with ECTL' and 'collecting interaction datasets from N_collect = 100 episodes of Π acting in the environment.' Also: 'We developed an interactive UI through which a user could simultaneously control two agents in the driving game... This was then used to collect data from 70 episodes.' However, no concrete access information (link, DOI, or citation) for these datasets is provided, implying they are not publicly available.

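Because the datasets are not released, a reproducer would have to regenerate them by rolling out trained agents. The sketch below shows a generic episode-collection loop under assumed interfaces: the Gym-style `env` methods and the `policies[agent_id].act()` call are hypothetical stand-ins, not the paper's actual API.

```python
def collect_interaction_dataset(env, policies, n_episodes=100):
    """Roll out a team of policies and record per-agent
    (observation, message, action) tuples for each timestep.

    `env` and `policies` are assumed interfaces: env.reset() returns a
    dict of per-agent observations, env.step() takes a dict of actions
    and returns (observations, rewards, done), and each policy's
    .act(obs) returns a (message, action) pair.
    """
    dataset = []
    for _ in range(n_episodes):
        observations = env.reset()
        done = False
        while not done:
            steps = {}
            for agent_id, obs in observations.items():
                message, action = policies[agent_id].act(obs)
                steps[agent_id] = (obs, message, action)
            observations, _, done = env.step(
                {aid: act for aid, (_, _, act) in steps.items()}
            )
            dataset.append(steps)
    return dataset
```
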
Dataset Splits: No
LLM Response: The paper mentions training and evaluation episodes, but it does not provide specific details on train/validation/test dataset splits (e.g., percentages, sample counts, or explicit instructions for partitioning the data for reproduction).

Hardware Specification: No
LLM Response: The paper mentions 'compute infrastructure' in the acknowledgements but does not give any specific hardware details, such as GPU models, CPU types, or cloud instance specifications used for the experiments.

Software Dependencies: No
LLM Response: The paper refers to using 'Multi-Agent Proximal Policy Optimisation (MAPPO)' and 'a Gumbel-Softmax function' but does not list software dependencies with specific version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). While it points to a GitHub repository for 'further training details and hyperparameters', the paper itself lacks this information.

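For context on the Gumbel-Softmax mentioned in this entry, PyTorch provides a built-in `torch.nn.functional.gumbel_softmax`; the snippet below is a generic usage sketch, with a vocabulary size and temperature chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

# Logits over a hypothetical 10-symbol message vocabulary.
logits = torch.randn(4, 10)  # batch of 4 messages

# Differentiable "soft" sample: a relaxed one-hot vector whose
# gradients flow back into the logits (tau controls sharpness).
soft_msgs = F.gumbel_softmax(logits, tau=1.0, hard=False)

# Straight-through variant: discrete one-hot in the forward pass,
# soft gradients in the backward pass.
hard_msgs = F.gumbel_softmax(logits, tau=1.0, hard=True)

print(soft_msgs.sum(dim=-1))     # each row sums to 1
print(hard_msgs.argmax(dim=-1))  # discrete symbol indices
```
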
Experiment Setup: No
LLM Response: The paper states 'Further training details and hyperparameters can be found in the project's GitHub repository' but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations in the main text.