Learning Translations: Emergent Communication Pretraining for Cooperative Language Acquisition
Authors: Dylan Cope, Peter McBurney
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose and compare two methods for solving CLAPs: Behaviour Cloning (BC), and Emergent Communication pretraining and Translation Learning (ECTL)... We introduce two environments and train target communities of agents that cooperate via a learned communication protocol. Data is then gathered from these communities and then used to train joiner agents with ECTL and the behaviour cloning (BC) algorithm [Sammut, 2010]. We demonstrate that ECTL is more robust than BC to expert demonstrations that give an incomplete picture of the underlying problem and that ECTL is significantly more effective than BC when communications data is limited. Finally, we apply these methods to user-generated data and show that ECTL can learn to communicate with a human to cooperatively solve a task. (A minimal behaviour-cloning sketch appears after this table.) |
| Researcher Affiliation | Academia | Dylan Cope and Peter McBurney, King's College London, dylan.cope@kcl.ac.uk |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. The paper contains architecture diagrams (Figure 1a, Figure 2) but not formal pseudocode. |
| Open Source Code | Yes | Further training details and hyperparameters can be found in the project's GitHub repository: https://github.com/DylanCope/learning-translations |
| Open Datasets | No | The paper describes creating custom datasets: 'Data is then gathered from these communities and then used to train joiner agents with ECTL' and 'collecting interaction datasets from Ncollect = 100 episodes of Π acting in the environment.' Also, 'We developed an interactive UI through which a user could simultaneously control two agents in the driving game... This was then used to collect data from 70 episodes.' However, no concrete access information (link, DOI, citation) for these datasets is provided, implying they are not publicly available. |
| Dataset Splits | No | The paper mentions training and evaluation episodes, but it does not provide specific details on train/validation/test dataset splits (e.g., percentages, sample counts, or explicit instructions for partitioning data for reproduction). |
| Hardware Specification | No | The paper mentions 'compute infrastructure' in the acknowledgements but does not provide any specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for the experiments. |
| Software Dependencies | No | The paper refers to using 'Multi-Agent Proximal Policy Optimisation (MAPPO)' and 'a Gumbel-Softmax function' but does not explicitly list software dependencies with specific version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). While it mentions a GitHub repository for 'Further training details and hyperparameters', the paper itself lacks this information. (A Gumbel-Softmax message-channel sketch appears after this table.) |
| Experiment Setup | No | The paper states 'Further training details and hyperparameters can be found in the project's GitHub repository' (referencing a GitHub link) but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations within the main text. |
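
The review above notes that the paper's behaviour-cloning (BC) baseline and its training configuration are not specified in the main text. As a point of reference only, here is a minimal behaviour-cloning sketch in PyTorch; the `JoinerPolicy` architecture, the `behaviour_clone` helper, and all dimensions and hyperparameters are illustrative assumptions, not the authors' implementation (which lives in the linked repository).

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical joiner policy: a small MLP mapping observations to action logits.
class JoinerPolicy(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # unnormalised action logits

def behaviour_clone(policy: nn.Module, demos: DataLoader,
                    epochs: int = 10, lr: float = 3e-4) -> nn.Module:
    """Fit the policy to expert (observation, action) pairs by supervised learning."""
    opt = optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for obs, actions in demos:
            opt.zero_grad()
            loss_fn(policy(obs), actions).backward()
            opt.step()
    return policy

# Stand-in demonstration data (random, for illustration only).
obs = torch.randn(512, 16)          # 512 timesteps, 16-dim observations
acts = torch.randint(0, 5, (512,))  # 5 discrete actions
demos = DataLoader(TensorDataset(obs, acts), batch_size=64, shuffle=True)
policy = behaviour_clone(JoinerPolicy(obs_dim=16, n_actions=5), demos)
```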
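
Similarly, the paper names a Gumbel-Softmax function for the learned communication channel without giving version or configuration details. Below is a minimal sketch of a differentiable discrete message channel using `torch.nn.functional.gumbel_softmax`; the vocabulary size, batch size, and temperature are assumed values, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def sample_message(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Draw a one-hot message token with the straight-through Gumbel-Softmax.

    The hard one-hot sample is used in the forward pass, while gradients
    flow through the soft relaxation, keeping the channel differentiable.
    """
    return F.gumbel_softmax(logits, tau=tau, hard=True)

# A speaker emitting one token from an assumed 10-symbol vocabulary.
vocab_size = 10
speaker_logits = torch.randn(4, vocab_size, requires_grad=True)  # batch of 4
messages = sample_message(speaker_logits)
assert messages.shape == (4, vocab_size) and messages.sum() == 4  # one-hot rows
```

The `hard=True` straight-through estimator is the standard choice when listener agents must consume discrete tokens while gradients still reach the speaker during end-to-end training.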