Agreement-Based Joint Training for Bidirectional Attention-Based Neural Machine Translation

Authors: Yong Cheng, Shiqi Shen, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Chinese-English and English-French translation tasks show that agreement-based joint training significantly improves both alignment and translation quality over independent training.
Researcher Affiliation | Collaboration | Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; Baidu Inc., Beijing, China; State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | For Chinese-English, the training corpus from LDC consists of 2.56M sentence pairs... For English-French, the training corpus from WMT 2014 consists of 12.07M sentence pairs...
Dataset Splits | Yes | For Chinese-English, the NIST 2006 dataset served as the validation set for hyper-parameter optimization and model selection, with the NIST 2002, 2003, 2004, 2005, and 2008 datasets as test sets. For English-French, the concatenation of news-test-2012 and news-test-2013 served as the validation set and news-test-2014 as the test set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software such as MOSES, RNNSEARCH, and SRILM, but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | The vocabulary size is set to 30K for all languages. The hyper-parameter λ, which balances the preference between likelihood and agreement, is set to 1.0 for Chinese-English and 2.0 for English-French.
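As a rough illustration of how the λ hyper-parameter from the experiment setup trades off translation likelihood against alignment agreement, the joint objective can be sketched as below. This is a minimal sketch, not the paper's implementation: the squared-difference disagreement between the two directions' attention matrices is an illustrative assumption, and all function and argument names are hypothetical.

```python
import numpy as np

def joint_loss(nll_s2t, nll_t2s, attn_s2t, attn_t2s, lam=1.0):
    """Hedged sketch of an agreement-based joint training objective.

    nll_s2t, nll_t2s : negative log-likelihoods of the source-to-target
        and target-to-source translation models for one sentence pair.
    attn_s2t : attention (alignment) matrix, shape (tgt_len, src_len).
    attn_t2s : attention matrix of the reverse model, shape (src_len, tgt_len).
    lam : the lambda hyper-parameter weighting the agreement penalty
        (1.0 for Chinese-English, 2.0 for English-French in the paper).
    """
    # Illustrative disagreement term: squared difference between the
    # forward attention and the transposed backward attention.
    disagreement = np.sum((attn_s2t - attn_t2s.T) ** 2)
    return nll_s2t + nll_t2s + lam * disagreement
```

Minimizing this combined loss pushes the two directional models toward both high likelihood and mutually consistent attention, which is the intuition behind the reported alignment and translation improvements.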