Incorporating Pragmatic Reasoning Communication into Emergent Language
Authors: Yipeng Kang, Tonghan Wang, Gerard de Melo
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experimental Setup): "We conduct a series of experiments that evaluate the relative merits of different pragmatic models, showing that they can improve the empirical communication accuracy both in typical referential game settings (Lazaridou et al., 2018) and in a StarCraft II simulation (Wang et al., 2019b)." (A minimal RSA-style sketch of such pragmatic models follows this table.) |
| Researcher Affiliation | Academia | Yipeng Kang, Tonghan Wang (Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; {kyp13, wangth18}@mails.tsinghua.edu.cn); Gerard de Melo (Hasso Plattner Institute, University of Potsdam, Potsdam, Germany; gdm@demelo.org) |
| Pseudocode | No | The paper describes methods using mathematical formulas and prose, but no structured pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | "Our code is freely available online": https://fringsoo.github.io/pragmatic_in2_emergent_papersite/ |
| Open Datasets | No | The paper describes a generated dataset ("We generated MuJoCo-like objects using PyBullet (Coumans and Bai, 2016–2020), with 8 possible colors and 5 possible shapes...") but provides no access information (link, DOI, or citation to a public repository) for the dataset itself. PyBullet is the tool used for generation, not the dataset. (A hedged generation sketch follows this table.) |
| Dataset Splits | No | The paper mentions a "training set" and a "test set" but does not report split sizes or any separate validation set. |
| Hardware Specification | Yes | "On AWS t3.xlarge (4 vCPUs, 16 GB memory), the training takes about 1 day and the total time to test all methods takes about 1 hour." "The training was conducted on an AWS g4dn.xlarge GPU instance (NVIDIA T4 Tensor Core GPU) and took about 20 hours." |
| Software Dependencies | No | The paper cites "PyBullet (Coumans and Bai, 2016–2020)", which indicates a version range, but it does not specify other key software dependencies (e.g., a machine learning framework such as TensorFlow or PyTorch) with version numbers. |
| Experiment Setup | Yes | Messages have a maximum length of 5 symbols and the alphabet size is 17. The long-term training processes 1,000,000 instances. Two separately pretrained AlexNet CNNs are invoked. Agent actions are sampled from P_S0 and P_L0, and these distributions are also penalized by an entropy term if they are not sufficiently evenly distributed at an early stage. For one testing epoch, each of the 1,000 test objects serves as the target once, while distractors in each round are sampled randomly. The StarCraft II runs use 30 million training steps, updating the model by sampling 8 episodes from a replay buffer in each step. (A configuration sketch collecting these values follows this table.) |
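For context on the pragmatic models the paper evaluates: pragmatic reasoning in the Rational Speech Act (RSA) tradition alternates between speaker and listener distributions, each obtained by renormalizing the other. Below is a minimal sketch of one RSA recursion over a toy lexicon; the lexicon values, rationality parameter, and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def normalize(m, axis):
    """Normalize along the given axis so the slices sum to 1 (zero-safe)."""
    s = m.sum(axis=axis, keepdims=True)
    return np.divide(m, s, out=np.zeros_like(m), where=s > 0)

# Hypothetical literal semantics: lexicon[t, w] = 1 if message w is true of target t.
lexicon = np.array([
    [1.0, 1.0, 0.0],   # target 0
    [0.0, 1.0, 1.0],   # target 1
    [0.0, 0.0, 1.0],   # target 2
])

# Literal listener L0: P(t | w) proportional to lexicon truth values times a uniform prior.
L0 = normalize(lexicon, axis=0)

# Pragmatic speaker S1: P(w | t) proportional to L0's success probability,
# raised to a rationality exponent alpha (alpha = 1 is an assumption here).
alpha = 1.0
S1 = normalize(L0 ** alpha, axis=1)

# Pragmatic listener L1: P(t | w) proportional to S1 times the uniform prior.
L1 = normalize(S1, axis=0)

print("L1 distribution P(target | message), one row per message:")
print(L1.T)
```

The pragmatic listener L1 resolves ambiguous messages by reasoning about which message a rational speaker would have chosen for each target, which is the mechanism the referential-game experiments probe.

Because the generated dataset is not released, a reproducer would need to regenerate the 8-color, 5-shape objects. The sketch below shows one way such objects could be rendered headlessly with PyBullet; the specific RGBA colors, shape parameters, and camera setup are placeholder assumptions, not the authors' generation script.

```python
import numpy as np
import pybullet as p

# Headless (DIRECT) mode: no GUI is needed for offline rendering.
client = p.connect(p.DIRECT)

# Assumed palette and shapes: the paper states 8 colors and 5 shapes
# but does not list them, so these values are placeholders.
COLORS = [
    (1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1), (1, 1, 0, 1),
    (1, 0, 1, 1), (0, 1, 1, 1), (1, 0.5, 0, 1), (0.5, 0, 1, 1),
]
SHAPES = [
    dict(shapeType=p.GEOM_SPHERE, radius=0.5),
    dict(shapeType=p.GEOM_BOX, halfExtents=[0.4, 0.4, 0.4]),
    dict(shapeType=p.GEOM_CYLINDER, radius=0.3, length=0.8),
    dict(shapeType=p.GEOM_CAPSULE, radius=0.25, length=0.6),
    dict(shapeType=p.GEOM_BOX, halfExtents=[0.6, 0.3, 0.2]),  # cuboid variant
]

def render_object(shape_kwargs, rgba, width=128, height=128):
    """Create one colored object, render it, and return an RGB image."""
    vis = p.createVisualShape(rgbaColor=list(rgba), **shape_kwargs)
    body = p.createMultiBody(baseVisualShapeIndex=vis, basePosition=[0, 0, 0])
    view = p.computeViewMatrix(cameraEyePosition=[2, 2, 2],
                               cameraTargetPosition=[0, 0, 0],
                               cameraUpVector=[0, 0, 1])
    proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.1, farVal=10)
    _, _, rgb, _, _ = p.getCameraImage(width, height, view, proj)
    p.removeBody(body)
    return np.reshape(rgb, (height, width, 4))[:, :, :3]

# 8 colors x 5 shapes = 40 object types, matching the counts in the paper.
images = [render_object(s, c) for c in COLORS for s in SHAPES]
p.disconnect(client)
```

Finally, the reported hyperparameters scattered through the setup description can be collected in one place. This is a hypothetical configuration summary assuming the values quoted above; the class and field names are ours, not taken from the released code.

```python
from dataclasses import dataclass

@dataclass
class ReferentialGameConfig:
    """Referential-game settings reported in the paper; field names are ours."""
    max_message_length: int = 5        # messages have at most 5 symbols
    alphabet_size: int = 17            # size of the symbol vocabulary
    long_term_training_instances: int = 1_000_000
    num_test_objects: int = 1_000      # each serves as the target once per test epoch
    entropy_penalty: bool = True       # applied to P_S0 / P_L0 early in training

@dataclass
class StarCraftConfig:
    """StarCraft II settings reported in the paper; field names are ours."""
    training_steps: int = 30_000_000
    episodes_per_update: int = 8       # sampled from a replay buffer at each step

print(ReferentialGameConfig())
print(StarCraftConfig())
```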