Local Translation Prediction with Global Sentence Representation

Authors: Jiajun Zhang, Dakun Zhang, Jie Hao

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The large-scale experiments show that our method can obtain substantial improvements in translation quality over the strong baseline: the hierarchical phrase-based translation model augmented with the neural network joint model.
Researcher Affiliation | Collaboration | Jiajun Zhang, Dakun Zhang and Jie Hao; National Laboratory of Pattern Recognition, CASIA, Beijing, China; Toshiba (China) R&D Center; jjzhang@nlpr.ia.ac.cn, {zhangdakun,haojie}@toshiba.com.cn
Pseudocode | No | The paper includes mathematical formulations and architectural diagrams (Figure 2 and Figure 4) but no explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper contains no statement about releasing source code and no link to a code repository for the described methodology.
Open Datasets | Yes | The bilingual training data from LDC contains about 2.1 million sentence pairs. This bilingual data is also utilized to train the two neural networks. The 5-gram language model is trained on the English part of the bilingual training data and the Xinhua portion of the English Gigaword corpus. (Footnote 3 lists the specific LDC dataset IDs: LDC2000T50, LDC2002L27, LDC2003E07, LDC2003E14, LDC2004T07, LDC2005T06, LDC2005T10 and LDC2005T34.)
Dataset Splits | Yes | NIST MT03 is used as the tuning data. MT05, MT06 and MT08 (news data) are used as the test data.
Hardware Specification | No | The paper does not provide hardware details such as GPU or CPU models, memory, or cloud instance types used for the experiments.
Software Dependencies | No | The paper mentions word2vec and Noise-Contrastive Estimation (NCE) but does not give version numbers for these or for any other software dependencies or libraries.
Experiment Setup | Yes | For the bilingually-constrained chunk-based CNN, the initial 192-dimensional word embeddings are trained with word2vec... We set the context window h = 3 for convolution. We will test multiple settings of the chunk number (C = 1, 2, 4, 8)... We apply L = 100 filters. The two fully connected linear layers both contain 192 neurons. The dropout ratio in the dropout layer is set to 0.5 to prevent overfitting. The standard back-propagation and stochastic gradient descent (SGD) algorithm is utilized to optimize this network. For the feed-forward neural network, we also apply the SGD algorithm. In our experiments, following [Devlin et al., 2014] we use n = 4 and m = 11.
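
For orientation, below is a minimal PyTorch sketch of a chunk-based convolutional sentence encoder configured with the hyperparameters quoted in the Experiment Setup row (192-dimensional embeddings, convolution window h = 3, L = 100 filters, C chunks, two 192-unit fully connected layers, dropout 0.5, plain SGD). Since the paper releases no code, the class name ChunkCNNEncoder, the vocabulary size, and the pooling and activation choices are assumptions made for illustration; this is an approximation of the described setup, not the authors' implementation. The feed-forward joint model with n = 4 and m = 11 (following Devlin et al., 2014) is not sketched here.

```python
import torch
import torch.nn as nn


class ChunkCNNEncoder(nn.Module):
    """Sketch of a chunk-based CNN sentence encoder.

    Hyperparameters follow the paper's reported setup; the architectural
    details (padding, tanh activations, chunk-wise max pooling) are
    assumptions, not the authors' released code.
    """

    def __init__(self, vocab_size, emb_dim=192, window=3,
                 num_filters=100, num_chunks=4, dropout=0.5):
        super().__init__()
        self.num_chunks = num_chunks
        # The paper initializes embeddings from word2vec; here they are random.
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, num_filters,
                              kernel_size=window, padding=window // 2)
        # Two fully connected layers, each with 192 neurons.
        self.fc1 = nn.Linear(num_filters * num_chunks, 192)
        self.fc2 = nn.Linear(192, 192)
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> (batch, emb_dim, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)
        x = torch.tanh(self.conv(x))                  # (batch, num_filters, seq_len)
        # Chunk-wise max pooling: split the sequence into C chunks, pool each.
        chunks = torch.chunk(x, self.num_chunks, dim=2)
        pooled = torch.cat([c.max(dim=2).values for c in chunks], dim=1)
        h = torch.tanh(self.fc1(self.dropout(pooled)))
        return torch.tanh(self.fc2(h))                # global sentence representation


# Usage sketch: optimize with plain SGD, as stated in the setup.
model = ChunkCNNEncoder(vocab_size=50000, num_chunks=4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
sentence_repr = model(torch.randint(0, 50000, (2, 30)))  # -> shape (2, 192)
```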