Incorporating Pragmatic Reasoning Communication into Emergent Language

Authors: Yipeng Kang, Tonghan Wang, Gerard de Melo

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | (Section 4, Experiments; 4.1, Experimental Setup) "We conduct a series of experiments that evaluate the relative merits of different pragmatic models, showing that they can improve the empirical communication accuracy both in typical referential game settings (Lazaridou et al., 2018) and in a StarCraft II simulation (Wang et al., 2019b)." |
| Researcher Affiliation | Academia | Yipeng Kang and Tonghan Wang: Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China ({kyp13, wangth18}@mails.tsinghua.edu.cn); Gerard de Melo: Hasso Plattner Institute, University of Potsdam, Potsdam, Germany (gdm@demelo.org). |
| Pseudocode | No | The paper describes its methods using mathematical formulas and prose, but no structured pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | "Our code is freely available online." Footnote: https://fringsoo.github.io/pragmatic_in2_emergent_papersite/ |
| Open Datasets | No | The paper describes a generated dataset ("We generated MuJoCo-like objects using PyBullet (Coumans and Bai, 2016-2020), with 8 possible colors and 5 possible shapes..."), but does not provide access information (link, DOI, or citation to a public repository) for this specific dataset; PyBullet is the tool used for generation, not the dataset itself. (A hedged generation sketch follows the table.) |
| Dataset Splits | No | The paper mentions a training set and a test set but does not describe a separate validation set or the split proportions. |
| Hardware Specification | Yes | "On AWS t3.xlarge (4 CPUs, 16 GB memory), the training takes about 1 day and the total time to test all methods takes about 1 hour." "The training was conducted on an AWS g4dn.xlarge GPU (NVIDIA T4 Tensor Core GPU) instance and took about 20 hours." |
| Software Dependencies | No | The paper cites "PyBullet (Coumans and Bai, 2016-2020)", which indicates a version range, but it does not specify other key software dependencies (e.g., machine learning frameworks such as TensorFlow or PyTorch) with version numbers. |
| Experiment Setup | Yes | "Messages have a maximum length of 5 symbols and the alphabet size is 17." "The long-term training processes 1,000,000 instances." "We invoke two separately pretrained AlexNet CNNs." "Agent actions are sampled from P_S0 and P_L0, and these distributions are also penalized by an entropy term if they are not sufficiently evenly distributed at an early stage." "For one testing epoch, each of the 1,000 test objects serves as the target once, while distractors in each round are sampled randomly." "We ran 30 million training steps and update the model by sampling 8 episodes from a replay buffer in each step." (A configuration sketch collecting these values follows the table.) |
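The Open Datasets row quotes the paper's generation procedure (MuJoCo-like objects rendered with PyBullet, 8 possible colors and 5 possible shapes) without a released dataset. The following is a minimal, illustrative sketch of how such objects could be generated and rendered headlessly with PyBullet; it is not the authors' code, and the shape primitives, RGBA palette, object sizes, and camera parameters are our own assumptions, since the quoted text only fixes the counts (8 colors, 5 shapes).

```python
# Illustrative sketch only: generate one colored object and render it with PyBullet.
# Only the counts (8 colors, 5 shapes) come from the paper; everything else is assumed.
import random

import pybullet as p

COLORS = [  # 8 hypothetical RGBA colors
    (1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1), (1, 1, 0, 1),
    (1, 0, 1, 1), (0, 1, 1, 1), (1, 0.5, 0, 1), (0.5, 0.5, 0.5, 1),
]
SHAPES = ["sphere", "cube", "slab", "cylinder", "capsule"]  # 5 hypothetical primitives


def make_visual_shape(kind):
    """Create a PyBullet visual shape for one of the assumed primitives."""
    if kind == "sphere":
        return p.createVisualShape(p.GEOM_SPHERE, radius=0.3)
    if kind == "cube":
        return p.createVisualShape(p.GEOM_BOX, halfExtents=[0.3, 0.3, 0.3])
    if kind == "slab":
        return p.createVisualShape(p.GEOM_BOX, halfExtents=[0.4, 0.4, 0.1])
    if kind == "cylinder":
        return p.createVisualShape(p.GEOM_CYLINDER, radius=0.25, length=0.6)
    if kind == "capsule":
        return p.createVisualShape(p.GEOM_CAPSULE, radius=0.2, length=0.5)
    raise ValueError(kind)


def render_object(color, kind, width=128, height=128):
    """Place a single colored object at the origin and render one frame."""
    p.resetSimulation()
    body = p.createMultiBody(baseVisualShapeIndex=make_visual_shape(kind),
                             basePosition=[0, 0, 0])
    p.changeVisualShape(body, -1, rgbaColor=color)
    view = p.computeViewMatrix(cameraEyePosition=[1.5, 1.5, 1.0],
                               cameraTargetPosition=[0, 0, 0],
                               cameraUpVector=[0, 0, 1])
    proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.1, farVal=10)
    _, _, rgb, _, _ = p.getCameraImage(width, height, view, proj)
    return rgb  # RGBA pixel buffer (flat list or H x W x 4 array, depending on NumPy)


if __name__ == "__main__":
    p.connect(p.DIRECT)  # headless rendering, no GUI required
    image = render_object(random.choice(COLORS), random.choice(SHAPES))
```

Rendering in `p.DIRECT` mode needs no display, which is consistent with the headless AWS instances listed under Hardware Specification.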
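The Experiment Setup row scatters the quoted hyperparameters across several sentences. The sketch below simply collects them into configuration objects for readability; the field names, the split into a referential-game config and a StarCraft II config, and the entropy coefficient value are illustrative assumptions, while the numeric values are taken from the quotes above.

```python
# Hypothetical consolidation of the quoted hyperparameters; field names and the
# entropy coefficient are our own choices, only the quoted numbers are grounded.
from dataclasses import dataclass


@dataclass
class ReferentialGameConfig:
    max_message_length: int = 5          # "messages have a maximum length of 5 symbols"
    alphabet_size: int = 17              # "the alphabet size is 17"
    long_term_training_instances: int = 1_000_000
    num_test_objects: int = 1_000        # each serves as the target once per test epoch
    entropy_coef: float = 0.01           # assumed value; the paper only states that an
                                         # entropy penalty is applied early in training


@dataclass
class StarCraftConfig:
    training_steps: int = 30_000_000     # "30 million training steps"
    episodes_per_update: int = 8         # sampled from a replay buffer at each step


cfg = ReferentialGameConfig()
```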