“Bilingual Expert” Can Find Translation Errors
Authors: Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si
AAAI 2019, pp. 6367-6374
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our approach achieves the state-of-the-art performance in most public available datasets of WMT 2017/2018 QE task. |
| Researcher Affiliation | Industry | Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si Alibaba Group Inc. k.fan,joanne.wjy,shiji.lb,zfm104435,boxing.cbx,luo.si@alibaba-inc.com |
| Pseudocode | Yes | Algorithm 1: Translation Quality Estimation with Bi-Transformer and Bi-LSTM (a hedged sketch of this pipeline appears after the table) |
| Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code for their methodology is open-source or publicly available. |
| Open Datasets | Yes | The data resources that we used for training the neural Bilingual Expert model are mainly from WMT (http://www.statmt.org/wmt18/): (i) parallel corpora released for the WMT17/18 News Machine Translation Task, (ii) UFAL Medical Corpus and Khresmoi development data released for the WMT17/18 Biomedical Translation Task, (iii) src-pe pairs for the WMT17/18 QE Task. |
| Dataset Splits | Yes | We evaluate our algorithm on the testing data of WMT 2017/2018, and development data of CWMT 2018. For fair comparison, we tuned all the hyper-parameters of our model on the development data, and reported the corresponding results for the testing data. |
| Hardware Specification | Yes | The bilingual expert model is trained on 8 Nvidia P-100 GPUs for about 3 days until convergence. For translation QE model, we use only one layer Bi-LSTM, and it is trained on a single GPU. |
| Software Dependencies | No | The paper mentions software like 'scikit-learn' and 'CRFSuite toolkit' but does not specify their version numbers or other required software dependencies with versions. |
| Experiment Setup | Yes | The number of layers in the bidirectional transformer for each module is 2, and the number of hidden units for feedforward sub-layer is 512. We use the 8-head self-attention in practice, since the single one is just a weighted average of previous layers. For translation QE model, we use only one layer Bi-LSTM... (these reported values are collected in the configuration sketch below) |
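
The Pseudocode and Experiment Setup rows describe a two-stage pipeline: a pre-trained bidirectional-transformer "bilingual expert" extracts per-token features, and a one-layer Bi-LSTM maps those features to quality estimates. Since no source code is linked in the paper, the sketch below is only a minimal illustration of the second stage, assuming 512-dimensional expert features and sentence-level HTER regression; the class name, feature dimension, and mean-pooling step are assumptions, not the authors' implementation.

```python
# Minimal sketch of the Bi-LSTM quality-estimation head, assuming pre-extracted
# per-token features from the bilingual expert model. Dimensions and the pooling
# strategy are illustrative assumptions, not values taken from the paper.
import torch
import torch.nn as nn


class SentenceQEHead(nn.Module):
    """Bi-LSTM regressor mapping per-token expert features to a sentence-level
    quality score (e.g. HTER in [0, 1])."""

    def __init__(self, feature_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        # One-layer bidirectional LSTM, as reported in the paper's setup.
        self.bilstm = nn.LSTM(feature_dim, hidden_dim, num_layers=1,
                              batch_first=True, bidirectional=True)
        self.score = nn.Sequential(nn.Linear(2 * hidden_dim, 1), nn.Sigmoid())

    def forward(self, expert_features: torch.Tensor) -> torch.Tensor:
        # expert_features: (batch, target_len, feature_dim) from the bilingual expert.
        states, _ = self.bilstm(expert_features)
        # Mean-pool over time; this pooling choice is an assumption, not the paper's.
        pooled = states.mean(dim=1)
        return self.score(pooled).squeeze(-1)  # (batch,) predicted HTER


if __name__ == "__main__":
    # Dummy batch: 4 translations, 20 target tokens each, 512-d expert features.
    features = torch.randn(4, 20, 512)
    model = SentenceQEHead()
    print(model(features))  # four scores in (0, 1)
```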
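
For quick reference, the hyper-parameters quoted in the Hardware Specification and Experiment Setup rows can be collected into a single configuration object. This is purely a convenience sketch: the field names are not the authors', and values not stated in the quoted excerpts (vocabulary size, embedding size, learning rate, etc.) are deliberately omitted.

```python
# Reported hyper-parameters gathered in one place; field names are assumptions,
# values are taken only from the excerpts quoted in the table above.
from dataclasses import dataclass


@dataclass(frozen=True)
class BilingualExpertConfig:
    transformer_layers_per_module: int = 2   # "number of layers ... for each module is 2"
    ffn_hidden_units: int = 512              # feedforward sub-layer size
    attention_heads: int = 8                 # 8-head self-attention
    qe_bilstm_layers: int = 1                # one-layer Bi-LSTM for the QE model
    expert_training_gpus: int = 8            # 8x NVIDIA P100, ~3 days, for the expert model
    qe_training_gpus: int = 1                # QE model trained on a single GPU


print(BilingualExpertConfig())
```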