Translation Prediction with Source Dependency-Based Context Representation

Authors: Kehai Chen, Tiejun Zhao, Muyun Yang, Lemao Liu

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluated on a large-scale Chinese-English translation task, the proposed approach achieves a significant improvement (of up to +1.9 BLEU points) over the baseline system and meanwhile outperforms a number of context-enhanced comparison systems.
Researcher Affiliation | Academia | Machine Intelligence and Translation Laboratory, Harbin Institute of Technology, Harbin, China; ASTREC, National Institute of Information and Communications Technology, Kyoto, Japan
Pseudocode | No | The paper describes the model architecture and training process with mathematical equations and textual descriptions, but does not include explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions external toolkits such as Moses, SRILM, GIZA++, the Stanford dependency parser, and word2vec, but does not provide access to source code for the methodology it describes.
Open Datasets | Yes | The training data contains 1.46 million sentence pairs from LDC corpora (footnote 4 of the paper: LDC2002E18, LDC2003E07, LDC2003E14, the Hansards portion of LDC2004T07, LDC2004T08, and LDC2005T06).
Dataset Splits | Yes | Minimum error rate training (MERT) (Och 2003) was used to optimize the feature weights on the NIST02 set, with NIST03/NIST04/NIST05 serving as test sets (a hedged MERT invocation sketch follows the table).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper names software such as Moses, the SRILM toolkit, GIZA++, the Stanford dependency parser, and the word2vec toolkit, but provides no version numbers for them (e.g., "srilm toolkit 3" in the text points to footnote 3, not version 3).
Experiment Setup | Yes | Most models had a vocabulary size of 50k. The word2vec toolkit was used to generate 100-dimensional embeddings for words in historical DBiCUs and 500-dimensional embeddings for words in the predicted DBiCU. These parameters were optimized by 10 epochs of stochastic gradient descent with a minibatch size of 500 and a learning rate of 1 (see the training sketch after the table).
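
For the Dataset Splits row, the tuning step matches Moses' stock MERT driver. The sketch below is a minimal reconstruction, not the authors' script: all file names and paths are hypothetical, and only the positional arguments of mert-moses.pl are shown.

```python
import subprocess

# Hypothetical file layout; the paper does not release its scripts.
DEV_SRC = "nist02.src"                    # NIST02 source side (tuning set)
DEV_REF = "nist02.ref"                    # reference prefix (nist02.ref0, ...)
MOSES_BIN = "mosesdecoder/bin/moses"      # decoder binary
MOSES_INI = "model/moses.ini"             # decoder configuration

# mert-moses.pl (shipped with Moses) takes the tuning source, the reference
# prefix, the decoder binary, and the decoder config as positional arguments,
# then iteratively re-decodes NIST02 and re-optimizes the feature weights
# (Och 2003). NIST03/NIST04/NIST05 are decoded afterwards for evaluation.
subprocess.run(
    ["perl", "mosesdecoder/scripts/training/mert-moses.pl",
     DEV_SRC, DEV_REF, MOSES_BIN, MOSES_INI],
    check=True,
)
```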
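
For the Experiment Setup row, the reported embedding and optimizer settings translate into roughly the following. This is a sketch under stated assumptions, not the authors' implementation: gensim (≥ 4) stands in for the original word2vec toolkit, the training file name is hypothetical, and a placeholder layer stands in for the unpublished network.

```python
from gensim.models import Word2Vec
import torch

# Hypothetical tokenized training file; the LDC corpora are licensed data.
with open("train.tok.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

# Reported dimensions: 100-d embeddings for words in historical DBiCUs,
# 500-d embeddings for words in the predicted DBiCU.
emb_hist = Word2Vec(sentences, vector_size=100)
emb_pred = Word2Vec(sentences, vector_size=500)

# Reported optimizer settings: 10 epochs of SGD, minibatch size 500,
# learning rate 1. The Linear layer is only a placeholder for the paper's
# actual scoring network, which is not released.
model = torch.nn.Linear(100, 500)
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
EPOCHS, BATCH_SIZE = 10, 500
```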