Computational Language Acquisition with Theory of Mind

Authors: Andy Liu, Hao Zhu, Emmy Liu, Yonatan Bisk, Graham Neubig

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We build language-learning agents equipped with ToM, and measure its effects on the learning process. We experiment with varying task difficulty, hypothesizing that models will acquire more complex language to adapt to stronger environmental pressures. We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting. We also find some evidence that increasing task difficulty in the training process results in more fluent and precise utterances in evaluation. ... In experiments, we find that (RQ1) speaker models including ToM components generally outperform those that do not in terms of fluency and final accuracy. We also find that (RQ2) training with more visually and semantically similar distractor referents causes the speaker model to develop longer, more fluent, and more precise utterances in evaluation.
Researcher Affiliation | Academia | Andy Liu, Harvey Mudd College, Claremont, CA, USA ({ajliu}@g.hmc.edu); Hao Zhu, Emmy Liu, Yonatan Bisk, Graham Neubig, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA ({zhuhao, mengyan3, ybisk, gneubig}@cs.cmu.edu)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It provides mathematical equations for model components and training objectives.
Open Source Code | Yes | Code and data can be found at https://github.com/neulab/ToM-Language-Acquisition.
Open Datasets | Yes | Images and captions are drawn from the MS COCO dataset introduced in Lin et al. (2014).
Dataset Splits | Yes | The train-val-test split given by the MS COCO 2017 dataset is extended to our experimental setup.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for the experiments. It only refers to general model architectures like ResNet and LSTM.
Software Dependencies | No | The paper mentions several software components and models (e.g., ResNet, LSTM, PPO, GPT-2 large, spaCy, CLIP, RoBERTa, sentence-transformers) but does not provide specific version numbers for any of them.
Experiment Setup | Yes | The speaker generates a sequence of tokens up to either the maximum length, which is set to 20 in our experiments, or an end-of-sequence token. In our experiments, we use a vocabulary size of 200... We introduce two thresholds, θ_1 and θ_2. ... We train models with three different settings of w_l. We train models with w_l = 0... w_l = 1... Finally, we train models where w_l is the arbitrarily high constant 1000. ... σ is set to decay linearly over time...
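
The Research Type and Experiment Setup rows above describe a referential game in which the speaker's ToM listener component is weighted by w_l (0, 1, or an arbitrarily high constant such as 1000). The sketch below is not the authors' code; it is a minimal, hypothetical illustration of one common way such a weighted ToM listener can be used, namely scoring candidate utterances by combining the speaker's own log-probability with the internal listener's log-probability of picking the correct referent. All function and variable names here are illustrative.

```python
import torch
import torch.nn.functional as F


def rank_candidates_with_tom(speaker_logprobs, tom_listener_logits, target_idx, w_l):
    """Hypothetical sketch of weighting an internal "theory of mind" listener
    when the speaker chooses among candidate utterances in a referential game.

    speaker_logprobs:    (num_candidates,) log-probability of each candidate
                         utterance under the speaker model.
    tom_listener_logits: (num_candidates, num_images) scores the ToM listener
                         assigns to each candidate image for each utterance.
    target_idx:          index of the true target image.
    w_l:                 listener weight; the paper reports runs with
                         w_l = 0, 1, and an arbitrarily high constant (1000).
    """
    # Log-probability that the internal ToM listener would pick the correct referent.
    tom_logprobs = F.log_softmax(tom_listener_logits, dim=-1)[:, target_idx]
    # Combined score: with w_l = 0 the ToM component is ignored; with a very
    # large w_l the ToM listener effectively decides which utterance is sent.
    scores = speaker_logprobs + w_l * tom_logprobs
    return torch.argmax(scores).item()
```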
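The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below only mirrors the values stated in that row (maximum utterance length 20, vocabulary size 200, w_l ∈ {0, 1, 1000}, linearly decaying σ); the field names are assumptions, and the θ_1, θ_2, and σ endpoint values are placeholders because their concrete settings are not quoted here.

```python
from dataclasses import dataclass


@dataclass
class ReferentialGameConfig:
    """Hypothetical config mirroring the hyperparameters quoted above;
    field names are illustrative, not taken from the released repository."""
    max_utterance_len: int = 20    # speaker stops at this length or at EOS
    vocab_size: int = 200          # vocabulary size used in the experiments
    listener_weight: float = 1.0   # w_l: 0, 1, or an arbitrarily high 1000
    theta1: float = 0.0            # threshold θ_1 (value not quoted here)
    theta2: float = 0.0            # threshold θ_2 (value not quoted here)
    sigma_start: float = 1.0       # σ decays linearly over training (placeholder endpoints)
    sigma_end: float = 0.0
    total_steps: int = 100_000     # placeholder training horizon


def sigma_at(step: int, cfg: ReferentialGameConfig) -> float:
    """Linear decay of σ over training, as stated in the setup description."""
    frac = min(step / cfg.total_steps, 1.0)
    return cfg.sigma_start + frac * (cfg.sigma_end - cfg.sigma_start)
```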
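For the Open Datasets and Dataset Splits rows, a reproduction would start from the standard MS COCO 2017 images and captions. The snippet below is a generic loading sketch using torchvision's CocoCaptions dataset (which requires pycocotools and the official annotation files); the directory paths and the 224×224 resize are assumptions, not values reported in the paper.

```python
from torchvision import datasets, transforms

# Placeholder preprocessing; 224x224 is a typical ResNet input size (assumption).
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Official MS COCO 2017 train/val splits; paths are placeholders.
train_set = datasets.CocoCaptions(
    root="coco/train2017",
    annFile="coco/annotations/captions_train2017.json",
    transform=preprocess,
)
val_set = datasets.CocoCaptions(
    root="coco/val2017",
    annFile="coco/annotations/captions_val2017.json",
    transform=preprocess,
)

image, captions = train_set[0]  # an image tensor and its list of caption strings
```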