Non-parametric Online Learning from Human Feedback for Neural Machine Translation
Authors: Dongqi Wang, Haoran Wei, Zhirui Zhang, Shujian Huang, Jun Xie, Jiajun Chen
AAAI 2022, pp. 11431-11439
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on the EMEA and JRC-Acquis benchmarks demonstrate that our proposed method obtains substantial improvements in translation accuracy and achieves better adaptation performance with fewer repeated human correction operations. |
| Researcher Affiliation | Collaboration | National Key Laboratory for Novel Software Technology, Nanjing University, China; Machine Intelligence Technology Lab, Alibaba DAMO Academy. wangdq@smail.nju.edu.cn, {huangsj, chenjj}@nju.edu.cn, {funan.whr, zhirui.zzr, qingjing.xj}@alibaba-inc.com |
| Pseudocode | No | No pseudocode or algorithm block was found in the paper. |
| Open Source Code | Yes | Our code is open-sourced at https://github.com/wangqi1996/KoK. |
| Open Datasets | Yes | We conduct the experiments on two specific domain datasets from OPUS (Tiedemann 2012), which are widely employed by previous works (Zheng et al. 2021a; Cai et al. 2021): (1) the European Medicines Agency (EMEA) dataset (Tiedemann 2009), which consists of sentence-aligned documents focusing on medical products; (2) the JRC-Acquis corpus (Steinberger et al. 2006), which contains the European Union laws applicable to the EU member states. |
| Dataset Splits | Yes | Specifically, we divide the documents into five buckets based on their length (0-50, 50-100, 100-200, 200-500 and 500-1000). We randomly select documents so that the total document length of each bucket exceeds 1000. Detailed statistics for the EMEA/JRC datasets are shown in Table 1. (A bucketing sketch appears after the table.) |
| Hardware Specification | Yes | The latency is measured on a single GeForce GTX 1080 Ti GPU. (This specifies the GPU model, which is sufficient.) |
| Software Dependencies | No | We apply the FAIRSEQ (Ott et al. 2019) toolkit for NMT implementation, and Faiss (Johnson, Douze, and Jégou 2017) with the Exact Search for L2 setting for efficient KNN retrieval. Following the previous experiences (Zheng et al. 2021a; Khandelwal et al. 2020), we employ the WMT19 German-English news translation task winner model (Ng et al. 2019) as the pre-trained model. No version numbers are provided for FAIRSEQ, Faiss, or the NMT model. (A retrieval sketch appears after the table.) |
| Experiment Setup | Yes | Following the previous experiences (Zheng et al. 2021a; Khandelwal et al. 2020), we employ the WMT19 German-English news translation task winner model (Ng et al. 2019) as the pre-trained model. The K for both Token-KNN and Policy-KNN is 8. (An interpolation sketch appears after the table.) |
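The Dataset Splits row describes length-bucketed document sampling. Below is a minimal sketch of one way to reproduce that split, assuming lengths are counted in tokens and that sampling into a bucket stops once its total length passes 1000 (the quoted threshold); the function name `split_by_length`, the seed handling, and the exact stopping rule are illustrative assumptions, not details from the paper.

```python
# Hypothetical reconstruction of the length-bucketed document split.
# Bucket boundaries come from the quoted description; everything else
# (token-level lengths, the sampling loop, the seed) is assumed.
import random

BUCKETS = [(0, 50), (50, 100), (100, 200), (200, 500), (500, 1000)]

def split_by_length(documents, min_total=1000, seed=0):
    """documents: list of token lists; returns sampled documents per bucket."""
    rng = random.Random(seed)
    selected = {bucket: [] for bucket in BUCKETS}
    for lo, hi in BUCKETS:
        pool = [doc for doc in documents if lo <= len(doc) < hi]
        rng.shuffle(pool)
        total = 0
        for doc in pool:
            if total >= min_total:
                break
            selected[(lo, hi)].append(doc)
            total += len(doc)
    return selected
```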
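The "Exact Search for L2" setting named in the Software Dependencies row corresponds to Faiss's brute-force `IndexFlatL2` index. The sketch below shows that retrieval path under stated assumptions: the keys are random placeholders, and the 1024-dimensional key size is assumed to match the WMT19 model's decoder states; only the exact-L2 index and K = 8 come from the paper's setup.

```python
# Minimal Faiss exact-L2 retrieval sketch (not the paper's datastore code).
import numpy as np
import faiss

d = 1024                                            # assumed decoder hidden size
keys = np.random.rand(10000, d).astype("float32")   # placeholder datastore keys
index = faiss.IndexFlatL2(d)                        # exact (brute-force) L2 search
index.add(keys)

query = np.random.rand(1, d).astype("float32")      # one decoder hidden state
distances, ids = index.search(query, 8)             # K = 8, as in the paper
```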
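The Token-KNN component in the Experiment Setup row follows the token-level kNN-MT formulation (Khandelwal et al. 2020), which interpolates the NMT distribution with a distribution over the retrieved neighbors' target tokens: p(y_t) = λ·p_kNN(y_t) + (1 - λ)·p_NMT(y_t). Below is a hedged sketch of that interpolation; the `lam` and `temperature` values are placeholders, since the quoted setup fixes only K = 8.

```python
# Sketch of token-level kNN-MT interpolation; lam/temperature are assumed.
import numpy as np

def knn_mt_probs(nmt_probs, distances, neighbor_tokens, vocab_size,
                 lam=0.5, temperature=10.0):
    """Mix the NMT distribution with a softmax over -distance of the
    K retrieved neighbors (K = 8 in the paper's setting)."""
    weights = np.exp(-np.asarray(distances, dtype=float) / temperature)
    weights /= weights.sum()
    knn_probs = np.zeros(vocab_size)
    for w, tok in zip(weights, neighbor_tokens):
        knn_probs[tok] += w          # neighbors sharing a token accumulate mass
    return lam * knn_probs + (1.0 - lam) * np.asarray(nmt_probs)
```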