Towards Discourse-Aware Document-Level Neural Machine Translation
Authors: Xin Tan, Longyin Zhang, Fang Kong, Guodong Zhou
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the document-level English-German and English-Chinese translation tasks with three domains (TED, News, and Europarl). Experimental results show that our Disco2NMT model significantly surpasses both context-agnostic and context-aware baseline systems on multiple evaluation indicators. |
| Researcher Affiliation | Academia | Xin Tan, Longyin Zhang, Fang Kong and Guodong Zhou; School of Computer Science and Technology, Soochow University, China; {xtan9, lyzhang9}@stu.suda.edu.cn, {kongfang, gdzhou}@suda.edu.cn |
| Pseudocode | No | The paper describes methods using natural language and mathematical equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to datasets and third-party tools, but there is no explicit statement or link indicating that the source code for the Disco2NMT model itself is publicly available. |
| Open Datasets | Yes | Datasets. We conduct several experiments on the English-German and English-Chinese translation tasks with corpora from the following three domains: TED (En-De/En-Zh): For English-German, we use TED talks from IWSLT2017 [Cettolo et al., 2012] evaluation campaigns as the training corpus... News (En-De/En-Zh): For English-German, we take News Commentary V11 as our training corpus... Europarl (En-De): The corpus is extracted from the Europarl V7 according to Maruf and Haffari [2018]. |
| Dataset Splits | Yes | For English-German, we use TED talks from IWSLT2017 [Cettolo et al., 2012] evaluation campaigns as the training corpus, tst2016-2017 as the test corpus, and the rest as the development corpus. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper mentions software such as 'Moses Toolkit', 'subword-nmt', 'Transformer', 'Adam optimizer', and 'Fasttext', but it does not specify version numbers for any of these components, which are required for reproducibility. |
| Experiment Setup | Yes | Specifically, the hidden size and filter size were set to 512 and 2048, respectively. Both encoder and decoder were composed of 6 hidden layers. The source and target vocabulary size were set to 30K. The beam size and dropout rate were set to 5 and 0.1, respectively. We used the Adam optimizer to train our model. |
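
For readers who want to sanity-check the quoted hyperparameters, the sketch below instantiates a plain PyTorch Transformer with the values from the Experiment Setup row. It is a minimal illustration, not the authors' Disco2NMT implementation (no code was released); the attention-head count, learning rate, and Adam betas are not given in the excerpt and are marked as assumptions in the comments.

```python
# Minimal sketch of the quoted experiment setup, using vanilla PyTorch rather
# than the authors' unreleased Disco2NMT code. Quoted values: hidden size 512,
# filter size 2048, 6 encoder/decoder layers, 30K source/target vocabularies,
# beam size 5, dropout 0.1, Adam optimizer. Head count, learning rate, and
# Adam betas are assumptions (common Transformer defaults), not quoted values.
import torch
import torch.nn as nn

SRC_VOCAB_SIZE = 30_000   # quoted: source vocabulary size
TGT_VOCAB_SIZE = 30_000   # quoted: target vocabulary size
BEAM_SIZE = 5             # quoted: beam size used at decoding time

model = nn.Transformer(
    d_model=512,            # quoted: hidden size
    nhead=8,                # assumption: number of heads not stated in excerpt
    num_encoder_layers=6,   # quoted: 6 encoder layers
    num_decoder_layers=6,   # quoted: 6 decoder layers
    dim_feedforward=2048,   # quoted: filter size
    dropout=0.1,            # quoted: dropout rate
)

# Quoted: "We used the Adam optimizer to train our model."
# Learning rate, betas, and eps are assumptions, not reported settings.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.98), eps=1e-9)
```

Note that `nn.Transformer` does not include the 30K-entry embedding layers or the output projection; a full system would add those (and a beam-search decoder with beam size 5) on top of this core.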