Knowledge Graphs Enhanced Neural Machine Translation
Authors: Yang Zhao, Jiajun Zhang, Yu Zhou, Chengqing Zong
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experiments on Chinese-to-English and English-to-Japanese translation tasks demonstrate that our method significantly outperforms the strong baseline models in translation quality, especially in handling the induced entities. |
| Researcher Affiliation | Collaboration | Yang Zhao (1,2), Jiajun Zhang (1,2), Yu Zhou (1,4) and Chengqing Zong (1,2,3). Affiliations: (1) National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; (3) CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China; (4) Beijing Fanyu Technology Co., Ltd, Beijing, China |
| Pseudocode | Yes | Algorithm 1: Bilingual KG Entities Induction Method. Input: parallel sentence pairs D; source KG KGs; target KG KGt; pre-defined hyper-parameter δ. Output: bilingual KG entities induction set I. |
| Open Source Code | No | The paper mentions implementing the NMT model based on the THUMT toolkit and the knowledge embedding method based on the OpenKE toolkit, providing links to these third-party tools, but does not provide open-source code specifically for their proposed method or its unique components. |
| Open Datasets | Yes | The CN-EN parallel sentence pairs are extracted from the LDC corpus, which contains 2.01M sentence pairs. On the CN-EN task, we utilize three different KGs: i) Medical KG, where the source KG contains 0.38M triples and the target KG contains 0.23M triples, which are filtered from YAGO [Suchanek et al., 2007]. We construct 2000 medical sentence pairs as development set and 2000 medical sentence pairs as test set. ii) Tourism KG, where the source KG contains 0.16M triples. The target KG contains 0.28M triples, which are also filtered from YAGO. We also construct 2000 sentence pairs on tourism as development set, and 2000 other sentence pairs as test set. iii) General KG, where the source KG is randomly selected from CN-DBpedia and the target KG is randomly selected from YAGO. We choose NIST 03 as development set and NIST 04-06 as test set. We use the KFTT dataset as EN-JA parallel sentence pairs. The source and target KGs are DBP15K from [Sun et al., 2017]. |
| Dataset Splits | Yes | We construct 2000 medical sentence pairs as development set and 2000 medical sentence pairs as test set. [...] We also construct 2000 sentence pairs on tourism as development set, and 2000 other sentence pairs as test set. [...] We choose the NIST 03 as development set and NIST 04-06 as test set. |
| Hardware Specification | No | The paper does not specify any hardware used for experiments, such as GPU/CPU models or specific machine configurations. |
| Software Dependencies | No | The paper mentions using the THUMT and OpenKE toolkits but does not specify their version numbers or other software dependencies with version details. |
| Experiment Setup | Yes | where we set the hyper-parameter δ (Algorithm 1) to 0.45 (Medical), 0.47 (Tourism), 0.39 (General) and 0.43 (DBP15K) and λ (Section 4.2) to 0.86 (Medical), 0.82 (Tourism), 0.73 (General) and 0.82 (DBP15K). The oversample time n (Section 4.3) is set to 4 (Medical), 3 (Tourism), 2 (General) and 3 (DBP15K), respectively. All these hyper-parameters are tuned on the development set. We use the base-version parameters of the Transformer model. On all translation tasks, we apply BPE [Sennrich et al., 2016] with 30K merge operations. |
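The table above quotes Algorithm 1, which induces a bilingual entity set I by thresholding on the hyper-parameter δ. The paper's exact scoring function is not reproduced here, so the sketch below is only illustrative: it assumes each source- and target-KG entity already has an embedding (e.g., trained with OpenKE) and keeps a source-target pair whenever the best cosine similarity clears δ. All names (`induce_bilingual_entities`, the toy entity IDs) are hypothetical, not from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def induce_bilingual_entities(src_emb, tgt_emb, delta):
    """Toy stand-in for Algorithm 1: for each source-KG entity, find the
    most similar target-KG entity and keep the pair only if the similarity
    reaches the threshold delta. Inputs are dicts of entity -> embedding."""
    induced = set()
    for s, se in src_emb.items():
        best, best_sim = None, -1.0
        for t, te in tgt_emb.items():
            sim = cosine(se, te)
            if sim > best_sim:
                best, best_sim = t, sim
        if best is not None and best_sim >= delta:
            induced.add((s, best))
    return induced

# Tiny worked example with made-up 2-d embeddings and the Medical-KG
# setting delta = 0.45 reported in the table above.
src = {"aspirin_zh": [1.0, 0.0]}
tgt = {"aspirin_en": [0.9, 0.1], "paris_en": [0.0, 1.0]}
pairs = induce_bilingual_entities(src, tgt, delta=0.45)
```

With this toy data the Chinese and English "aspirin" entities are paired, while a very strict threshold (e.g., δ close to 1) would induce nothing; the reported per-domain δ values (0.39 to 0.47) suggest the real scoring function operates on a similarly bounded scale.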