Improved Neural Machine Translation with Source Syntax
Authors: Shuangzhi Wu, Ming Zhou, Dongdong Zhang
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on publicly available data sets with Chinese-English and English-Japanese translation tasks. Experimental results on Chinese-English task show that our model significantly improves translation accuracy over the conventional NMT and SMT baseline systems. |
| Researcher Affiliation | Collaboration | Harbin Institute of Technology, Harbin, China; Microsoft Research; {v-shuawu, mingzhou, dozhang}@microsoft.com |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not explicitly state that source code for their method is made publicly available. |
| Open Datasets | Yes | We conduct experiments on the Chinese-English translation task as well as the English-Japanese translation task where the same data set from WAT 2016 ASPEC corpus [Nakazawa et al., 2016] is used for a fair comparison with other work. In the Chinese-English translation task, the bilingual training data consists of a set of LDC datasets. |
| Dataset Splits | Yes | The development data set is NIST2003, and the testing data are NIST2005, NIST2006, NIST2008 and NIST2012 evaluation sets. The development data contains 1,790 sentences, and the test data contains 1,812 sentences with single reference per source sentence. Five groups of sentences are collected on the Japanese test set and the merged Chinese test set of NIST 2005, NIST 2006, NIST 2008 and NIST 2012, where source length ranges are {20-, 20-30, 30-40, 40-50, 50+}. The statistics of the five groups are shown in Table 3. (A sketch of this length grouping appears after the table.) |
| Hardware Specification | Yes | All model parameters are initialized randomly with Gaussian distribution and trained on a NVIDIA Tesla K40 GPU. |
| Software Dependencies | No | The paper mentions tools such as 'KyTea' and the Adadelta algorithm but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | The size of word embeddings is set to 512 for both tasks. The dimensions of hidden states for all RNNs are set to 1024. The stochastic gradient descent (SGD) algorithm is used to tune parameters with a learning rate of 1.0 and a batch size of 128. We use the beam search strategy for decoding with a beam size of 12. |
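
The Dataset Splits row reports a length-based analysis in which test sentences are grouped into five source-length ranges ({20-, 20-30, 30-40, 40-50, 50+}). Below is a minimal sketch of that grouping, assuming whitespace-tokenized source sentences; the paper does not publish its grouping script, so the boundary handling (inclusive upper bounds) and the file path in the usage comment are assumptions.

```python
from collections import defaultdict

def length_bucket(num_tokens):
    """Map a source-sentence length to one of the five ranges used in the
    paper's length analysis: 20-, 20-30, 30-40, 40-50, 50+.
    Boundary handling (inclusive upper bounds) is an assumption."""
    if num_tokens < 20:
        return "20-"
    if num_tokens <= 30:
        return "20-30"
    if num_tokens <= 40:
        return "30-40"
    if num_tokens <= 50:
        return "40-50"
    return "50+"

def group_by_length(source_sentences):
    """Collect sentences into the five length groups."""
    groups = defaultdict(list)
    for sent in source_sentences:
        groups[length_bucket(len(sent.split()))].append(sent)
    return groups

# Usage on the merged NIST 2005/2006/2008/2012 Chinese test set
# (the file name is hypothetical):
# with open("nist_merged.zh", encoding="utf-8") as f:
#     groups = group_by_length(f.read().splitlines())
# for key in ["20-", "20-30", "30-40", "40-50", "50+"]:
#     print(key, len(groups[key]))
```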
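The Experiment Setup row lists the reported hyperparameters (512-dimensional word embeddings, 1024-dimensional RNN hidden states, SGD with learning rate 1.0, batch size 128, beam size 12). The sketch below simply collects those values into one configuration object; the dataclass and its field names are illustrative, and only the numeric values come from the paper.

```python
from dataclasses import dataclass

@dataclass
class NMTConfig:
    """Hyperparameters reported in the paper's experiment setup.
    The structure and field names are illustrative; only the values
    are taken from the paper."""
    embedding_dim: int = 512   # word embedding size (both tasks)
    hidden_dim: int = 1024     # hidden-state size for all RNNs
    optimizer: str = "sgd"     # stochastic gradient descent
    learning_rate: float = 1.0
    batch_size: int = 128
    beam_size: int = 12        # beam search width at decoding time

config = NMTConfig()
print(config)
```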