Knowledge Graphs Enhanced Neural Machine Translation

Authors: Yang Zhao, Jiajun Zhang, Yu Zhou, Chengqing Zong

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The extensive experiments on Chinese-to-English and English-to-Japanese translation tasks demonstrate that our method significantly outperforms the strong baseline models in translation quality, especially in handling the induced entities."
Researcher Affiliation | Collaboration | "Yang Zhao (1,2), Jiajun Zhang (1,2), Yu Zhou (1,4) and Chengqing Zong (1,2,3); (1) National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; (3) CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China; (4) Beijing Fanyu Technology Co., Ltd, Beijing, China"
Pseudocode | Yes | "Algorithm 1: Bilingual K D Entities Induction Method. Input: parallel sentence pairs D; source KG KG_s; target KG KG_t; pre-defined hyper-parameter δ. Output: bilingual K D entities induction set I. Algorithm: [...]" (only the algorithm's header is excerpted; a hedged sketch of such an induction loop follows this table)
Open Source Code | No | The paper implements its NMT model with the THUMT toolkit and its knowledge-embedding method with the OpenKE toolkit, providing links to these third-party tools, but it does not release code for the proposed method or its unique components.
Open Datasets | Yes | "The CN→EN parallel sentence pairs are extracted from the LDC corpus, which contains 2.01M sentence pairs. On the CN→EN task, we utilize three different KGs: i) Medical KG, where the source KG contains 0.38M triples and the target KG contains 0.23M triples, filtered from YAGO [Suchanek et al., 2007]. We construct 2000 medical sentence pairs as the development set and 2000 medical sentence pairs as the test set. ii) Tourism KG, where the source KG contains 0.16M triples and the target KG contains 0.28M triples, also filtered from YAGO. We also construct 2000 tourism sentence pairs as the development set and 2000 other sentence pairs as the test set. iii) General KG, where the source KG is randomly selected from CN-DBpedia and the target KG is randomly selected from YAGO. We choose NIST 03 as the development set and NIST 04-06 as the test set. We use the KFTT dataset as the EN→JA parallel sentence pairs. The source and target KGs are DBP15K from [Sun et al., 2017]."
Dataset Splits | Yes | "We construct 2000 medical sentence pairs as development set and 2000 medical sentence pairs as test set. [...] We also construct 2000 sentence pairs on tourism as development set, and 2000 other sentence pairs as test set. [...] We choose the NIST 03 as development set and NIST 04-06 as test set."
Hardware Specification | No | The paper does not specify the hardware used for experiments, such as GPU/CPU models or machine configurations.
Software Dependencies | No | The paper mentions the THUMT and OpenKE toolkits but does not specify their version numbers or other software dependencies with version details.
Experiment Setup | Yes | "We set the hyper-parameter δ (Algorithm 1) to 0.45 (Medical), 0.47 (Tourism), 0.39 (General) and 0.43 (DBP15K), and λ (Section 4.2) to 0.86 (Medical), 0.82 (Tourism), 0.73 (General) and 0.82 (DBP15K). The oversample time n (Section 4.3) is set to 4 (Medical), 3 (Tourism), 2 (General) and 3 (DBP15K), respectively. All these hyper-parameters are fine-tuned on the development set. We use the base version parameters of the Transformer model. On all translation tasks, we use the BPE [Sennrich et al., 2016] method with 30K merge operations."
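
The Pseudocode row above quotes only Algorithm 1's header; the body of the algorithm is not reproduced in the excerpt. Purely as an illustration of what a threshold-based bilingual entity induction over parallel data could look like, here is a minimal Python sketch. The substring matching, the Dice-style pair score, and every identifier in it are assumptions made for this sketch, not the authors' published procedure.

```python
from itertools import product

def induce_bilingual_entities(parallel_pairs, kg_s_entities, kg_t_entities, delta=0.45):
    """Hypothetical induction of bilingual entity pairs (not the paper's Algorithm 1).

    parallel_pairs: list of (src_sentence, tgt_sentence) strings
    kg_s_entities / kg_t_entities: sets of entity surface forms from KG_s / KG_t
    delta: induction threshold, analogous to the paper's hyper-parameter
    Returns the induction set I of (source_entity, target_entity) pairs.
    """
    cooc, count_s, count_t = {}, {}, {}
    for src, tgt in parallel_pairs:
        src_entities = [e for e in kg_s_entities if e in src]
        tgt_entities = [e for e in kg_t_entities if e in tgt]
        for e in src_entities:
            count_s[e] = count_s.get(e, 0) + 1
        for e in tgt_entities:
            count_t[e] = count_t.get(e, 0) + 1
        for pair in product(src_entities, tgt_entities):
            cooc[pair] = cooc.get(pair, 0) + 1

    induced = set()
    for (e_s, e_t), c in cooc.items():
        # Dice coefficient as a stand-in alignment score; the paper's
        # actual scoring function is not given in the excerpt.
        score = 2 * c / (count_s[e_s] + count_t[e_t])
        if score >= delta:
            induced.add((e_s, e_t))
    return induced
```

Under this sketch, any entity pair whose score reaches δ (e.g. 0.45 for the Medical KG) enters the induction set I.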
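
The Experiment Setup row reports BPE with 30K merge operations. Assuming the reference subword-nmt implementation of Sennrich et al. [2016] and placeholder file names, learning and applying such a segmentation could look like this:

```python
from subword_nmt.learn_bpe import learn_bpe
from subword_nmt.apply_bpe import BPE

# Learn 30K merge operations from a tokenized training corpus
# ("train.tok.txt" and "bpe.codes" are placeholder file names).
with open("train.tok.txt", encoding="utf-8") as corpus, \
        open("bpe.codes", "w", encoding="utf-8") as codes_out:
    learn_bpe(corpus, codes_out, num_symbols=30000)

# Segment a tokenized sentence with the learned merges.
with open("bpe.codes", encoding="utf-8") as codes:
    bpe = BPE(codes)
print(bpe.process_line("knowledge graphs enhance neural machine translation ."))
```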