Search Engine Guided Neural Machine Translation

Authors: Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O.K. Li

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation on three language pairs (En-Fr, En-De, and En-Es) shows that the proposed approach significantly outperforms the baseline, and the improvement is larger when more relevant sentence pairs are retrieved.
Researcher Affiliation | Academia | The University of Hong Kong; New York University; CIFAR Azrieli Global Scholar
Pseudocode | Yes | Algorithm 1 (Greedy selection procedure to maximize the coverage of the source symbols) and Algorithm 2 (Learning for SEG-NMT); a sketch of the greedy coverage selection appears after the table.
Open Source Code | No | The paper mentions using Apache Lucene and provides its URL, but does not state that the authors' own code for the described methodology is open source.
Open Datasets | Yes | We use the JRC-Acquis corpus (Steinberger et al. 2006) for evaluating the proposed SEG-NMT model. The JRC-Acquis corpus consists of the total body of European Union (EU) law applicable to the member states. (Corpus available at http://optima.jrc.it/Acquis/JRC-Acquis.3.0/corpus/)
Dataset Splits | Yes | For each language pair, we uniformly select 3000 sentence pairs at random for both the development and test sets. The rest is used as a training set, after removing any sentence which contains special characters only. (A sketch of such a split appears after the table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions tools such as Apache Lucene, the Adam optimizer, GRUs, and BPE, but does not specify version numbers for any of the key software components used for implementation (e.g., Python or deep learning framework versions).
Experiment Setup | Yes | We use a standard attention-based neural machine translation model (Bahdanau, Cho, and Bengio 2014) with 1,024 gated recurrent units (GRU) (Cho et al. 2014) on each of the encoder and decoder. We train both the vanilla model as well as the proposed SEG-NMT based on this configuration from scratch using Adam (Kingma and Ba 2014) with the initial learning rate set to 0.001. We use a minibatch of up to 32 sentence pairs. For evaluation, we use beam search with width set to 5. In the case of the proposed SEG-NMT, we parametrize the metric matrix M in the similarity score function from Eq. (7) to be diagonal and initialized to an identity matrix. λ in Eq. (7) is initialized to 0. The gating network f_gate is a feedforward network with a single hidden layer, just like the attention mechanism f_att. (A sketch of this configuration appears after the table.)
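
The Pseudocode row quotes only the caption of Algorithm 1: a greedy selection procedure that maximizes coverage of the source symbols over the retrieved sentence pairs. Below is a minimal sketch of that idea in Python, assuming the retrieved pairs arrive as (source_tokens, target_tokens) tuples and that coverage is counted by plain token overlap; the function name and the max_pairs cap are illustrative and not taken from the paper.

```python
def greedy_coverage_selection(source_tokens, retrieved_pairs, max_pairs=5):
    """Greedily pick retrieved sentence pairs that cover the most
    still-uncovered source tokens (a sketch of Algorithm 1's idea,
    not the authors' implementation)."""
    uncovered = set(source_tokens)
    selected = []
    candidates = list(retrieved_pairs)
    while uncovered and candidates and len(selected) < max_pairs:
        # Score each candidate by how many uncovered source tokens it contains.
        best = max(candidates, key=lambda pair: len(uncovered & set(pair[0])))
        gain = len(uncovered & set(best[0]))
        if gain == 0:  # no remaining candidate adds coverage; stop early
            break
        selected.append(best)
        uncovered -= set(best[0])
        candidates.remove(best)
    return selected
```

Called on a tokenized source sentence and a list of search-engine-retrieved training pairs, this returns the subset that would guide translation; it is an illustration of the selection step only, not of the full SEG-NMT pipeline.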
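
For the Dataset Splits row, the following is a minimal sketch of a uniform random split like the one quoted, assuming the corpus fits in memory as (source, target) string pairs; the "special characters only" filter is approximated with a regex, since the paper does not spell out the exact rule.

```python
import random
import re

def make_splits(sentence_pairs, dev_size=3000, test_size=3000, seed=0):
    """Uniformly sample dev and test sets; the remainder becomes training data.
    The special-character filter below is an illustrative guess, not the paper's rule."""
    rng = random.Random(seed)
    pairs = list(sentence_pairs)
    rng.shuffle(pairs)
    dev = pairs[:dev_size]
    test = pairs[dev_size:dev_size + test_size]
    rest = pairs[dev_size + test_size:]
    # Drop training sentences that contain no alphanumeric characters at all.
    train = [(s, t) for s, t in rest
             if re.search(r"\w", s) and re.search(r"\w", t)]
    return train, dev, test
```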
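
For the Experiment Setup row, here is a sketch of how the diagonal, identity-initialized metric matrix M, the scalar λ initialized to 0, and the single-hidden-layer gating network f_gate might be parametrized. The use of PyTorch, the exact inputs to the gate, and the precise form of the Eq. (7) score are all assumptions; the paper's equations are not reproduced in this summary.

```python
import torch
import torch.nn as nn

HIDDEN = 1024  # GRU size on each of the encoder and decoder, per the quoted setup

# Training settings quoted in the row above (not enforced by this sketch):
# Adam with initial learning rate 0.001; minibatches of up to 32 sentence pairs;
# beam search with width 5 at evaluation time.

class DiagonalBilinearScore(nn.Module):
    """Bilinear similarity with a diagonal metric matrix M (initialized to the
    identity) and a scalar lambda initialized to 0; the exact form of Eq. (7)
    is not reproduced here, so this scoring function is illustrative."""
    def __init__(self, dim=HIDDEN):
        super().__init__()
        self.m_diag = nn.Parameter(torch.ones(dim))   # diag(M) = 1  ->  M = I
        self.lmbda = nn.Parameter(torch.zeros(1))     # lambda starts at 0

    def forward(self, h_query, h_retrieved):
        # h_query: (batch, dim); h_retrieved: (batch, n_retrieved, dim)
        weighted = h_query.unsqueeze(1) * self.m_diag  # apply the diagonal M
        return (weighted * h_retrieved).sum(-1) + self.lmbda

class GateNetwork(nn.Module):
    """Single-hidden-layer feedforward gate, mirroring the description of f_gate;
    feeding it the concatenated decoder and retrieved states is an assumption."""
    def __init__(self, dim=HIDDEN, hidden=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim * 2, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, h_decoder, h_retrieved):
        return self.net(torch.cat([h_decoder, h_retrieved], dim=-1))
```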