Emergent Translation in Multi-Agent Communication

Authors: Jason Lee, Kyunghyun Cho, Jason Weston, Douwe Kiela

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our model against a number of baselines, including a nearest neighbor method and a recently proposed model (Nakayama & Nishida, 2017) that maps languages and images to a shared space, but lacks communication. We evaluate performance on both word- and sentence-level translation, and show that our model outperforms the baselines in both settings. Additionally, we show ...
Researcher Affiliation | Collaboration | Jason Lee (New York University, jason@cs.nyu.edu); Kyunghyun Cho (New York University and Facebook AI Research, kyunghyun.cho@nyu.edu); Jason Weston (Facebook AI Research, jase@fb.com); Douwe Kiela (Facebook AI Research, dkiela@fb.com)
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks are present in the paper. The model architecture and training process are described in prose and mathematical equations.
Open Source Code | No | No explicit statement or link providing open-source code for the methodology described in this paper was found.
Open Datasets | Yes | We use the Bergsma500 dataset (Bergsma & Van Durme, 2011)... The Multi30k (Elliott et al., 2016) dataset... We use MS COCO (Lin et al., 2014; Chen et al., 2015), which contains 120k images and 5 English captions per image, and STAIR (Yoshikawa et al., 2017), a collection of Japanese annotations of the same dataset (also 5 per image).
Dataset Splits | Yes | We train on 80% of the images, and choose the model with the best communication accuracy on the 20% validation set when reporting translation performance. ... We use the original data split: 29k training, 1k validation and 1k test images. ... Following Karpathy & Li (2015), we use 110k training, 5k validation and 5k test images. (An illustrative split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions several software components, such as the Adam optimizer, a pre-trained ResNet with 50 layers, Moses, the byte pair encoding (BPE) algorithm, Gumbel-softmax, and REINFORCE, but it does not specify version numbers for any of them, which are needed for reproducibility. (A Gumbel-softmax sketch follows the table.)
Experiment Setup | Yes | We train with 1 distractor (K = 2), learning rate 3e-4, and minibatch size 128. The embedding and hidden state dimensionalities are set to 400. ... We train with 1 distractor (K = 2) and minibatch size 64. The hidden state size and embedding dimensionalities are 1024 and 512, respectively. The learning rate and dropout rate are tuned on the validation set for each task. (A hyperparameter sketch follows the table.)
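
Since the dataset-splits row only quotes percentages and counts, here is a minimal sketch of the 80/20 train/validation image split described for the word-level experiments. It assumes a flat list of image identifiers; the function name split_images, the fixed seed, and the example call are illustrative, not taken from the paper.

```python
import random

def split_images(image_ids, train_frac=0.8, seed=0):
    """Shuffle image ids and carve off a train/validation split (80/20 by default)."""
    rng = random.Random(seed)      # fixed seed only so the sketch is repeatable
    ids = list(image_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# Hypothetical usage with 500 placeholder image ids.
train_ids, valid_ids = split_images(range(500))
```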
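
The dependencies row names Gumbel-softmax without elaboration, so here is a minimal sketch of straight-through Gumbel-softmax sampling, the general technique rather than the authors' implementation, written against PyTorch; the function name and the default temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0, hard=True):
    """Sample a (near-)one-hot vector from categorical logits with reparameterized gradients."""
    # Gumbel(0, 1) noise; the small constants guard against log(0).
    gumbels = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    y_soft = F.softmax((logits + gumbels) / tau, dim=-1)
    if hard:
        # Straight-through estimator: one-hot in the forward pass,
        # softmax gradients in the backward pass.
        index = y_soft.argmax(dim=-1, keepdim=True)
        y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
        return y_hard + (y_soft - y_soft.detach())
    return y_soft
```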
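
To make the quoted word-level hyperparameters concrete, the following sketch wires them into an Adam optimizer; the two-layer projection is a hypothetical stand-in (2048 matches the ResNet-50 feature size, but the layers are not the authors' architecture).

```python
import torch
from torch import nn

# Word-level setting as quoted: 1 distractor (K = 2), learning rate 3e-4,
# minibatch size 128, embedding and hidden dimensionality 400.
K = 2
BATCH_SIZE = 128
EMBED_DIM = HIDDEN_DIM = 400

# Hypothetical stand-in module; the actual agents are defined by the paper's equations.
model = nn.Sequential(
    nn.Linear(2048, EMBED_DIM),   # project pre-trained ResNet-50 image features
    nn.Tanh(),
    nn.Linear(EMBED_DIM, HIDDEN_DIM),
)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
```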