Phrase Table as Recommendation Memory for Neural Machine Translation
Authors: Yang Zhao, Yining Wang, Jiajun Zhang, Chengqing Zong
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experiments demonstrate that the proposed methods obtain remarkable improvements over the strong attention-based NMT. Our empirical experiments on Chinese-English translation and English-Japanese translation tasks show the efficacy of our methods. |
| Researcher Affiliation | Academia | Yang Zhao 1,2, Yining Wang 1,2, Jiajun Zhang 1,2,4 and Chengqing Zong 1,2,3 1 National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China 2 University of Chinese Academy of Sciences, Beijing, China 3 CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China 4 Beijing Advanced Innovation Center for Language Resources, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Construct recommendation word set |
| Open Source Code | No | We use the Zoph RNN toolkit (https://github.com/isi-nlp/Zoph_RNN) to implement all our described methods. We extend this toolkit with global attention. The paper states that the authors used and extended an existing toolkit, but does not state that their own modifications or the full implementation of the described method are open source. |
| Open Datasets | Yes | In CH-EN translation, we test the proposed methods with two data sets: ... NIST 2003 (MT03) dataset is used for validation. NIST 2004-2006 (MT04-06) and NIST 2008 (MT08) datasets are used for testing. In EN-JA translation, we use the KFTT dataset (http://www.phontron.com/kftt/), which includes 0.44M sentence pairs for training, 1,166 sentence pairs for validation and 1,160 sentence pairs for testing. The referenced LDC corpora are LDC2000T50, LDC2002L27, LDC2002T01, LDC2002E18, LDC2003E07, LDC2003E14, LDC2003T17 and LDC2004T07. |
| Dataset Splits | Yes | NIST 2003 (MT03) dataset is used for validation. In EN-JA translation, we use the KFTT dataset, which includes 0.44M sentence pairs for training, 1,166 sentence pairs for validation and 1,160 sentence pairs for testing. |
| Hardware Specification | No | The paper mentions using 'Zoph RNN toolkit' and setting 'word embedding dimension and the size of hidden layers' and 'minibatch size', but does not specify any hardware components like CPU, GPU models, or memory. |
| Software Dependencies | No | The paper mentions using 'Zoph RNN toolkit' and 'Moses' for phrase table learning, but does not provide specific version numbers for these or any other software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | The word embedding dimension and the size of hidden layers are both set to 1,000. The minibatch size is set to 128. We limit the vocabulary to the 30K most frequent words for both the source and target languages. Other words are replaced by a special symbol, UNK. At test time, we employ beam search and the beam size is set to 12. |
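The setup values quoted in the Experiment Setup row can be collected into a single configuration sketch. This is an illustrative summary for reproduction purposes only: the key names and the helper function below are our own and do not correspond to any actual configuration schema of the Zoph RNN toolkit.

```python
# Hyperparameters as reported in the paper, gathered into a plain dict.
# Key names are hypothetical, chosen for readability.
nmt_config = {
    "embedding_dim": 1000,    # word embedding dimension
    "hidden_size": 1000,      # size of hidden layers
    "minibatch_size": 128,
    "vocab_size": 30_000,     # 30K most frequent words, source and target
    "unk_symbol": "UNK",      # replacement for out-of-vocabulary words
    "beam_size": 12,          # beam search width at test time
}

def vocab_lookup(word, vocab, config=nmt_config):
    """Map a word to itself if in-vocabulary, else to the UNK symbol,
    mirroring the paper's description of vocabulary truncation."""
    return word if word in vocab else config["unk_symbol"]
```

For example, with a vocabulary containing only the 30K most frequent words, `vocab_lookup("rare-word", vocab)` would return `"UNK"`, which is how the paper describes handling out-of-vocabulary tokens.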