Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance
Authors: Guanhua Chen, Yun Chen, Victor O.K. Li
Pages: 12630-12638
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on WMT16 De-En and WMT16 Ro-En show the effectiveness of our approaches on constrained NMT. In particular, the proposed EAM-OUTPUT method consistently outperforms previous approaches in translation quality, with light computational overheads over unconstrained baseline. |
| Researcher Affiliation | Academia | 1 The University of Hong Kong 2 Shanghai University of Finance and Economics |
| Pseudocode | No | The paper describes the decoding process and methods verbally and with a diagram (Figure 1), but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | Code is public at https://github.com/ghchen18/cdalign |
| Open Datasets | Yes | Models are trained on WMT16 De-En and WMT16 Ro-En training set and evaluated on alignment testset and WMT news translation testset. ... For the alignment testset, we use the hand-aligned, publicly available alignment testset for De-En (https://www-i6.informatik.rwth-aachen.de/goldAlignment) and Ro-En (http://web.eecs.umich.edu/~mihalcea/wpt/index.html#resources). |
| Dataset Splits | Yes | We use newstest2013 and newsdev2016 as development sets for De-En and Ro-En respectively. |
| Hardware Specification | Yes | The decoding speed is tested on a single GeForce RTX 2080Ti GPU. |
| Software Dependencies | Yes | Model is implemented on fairseq toolkit (Ott et al. 2019). ... We report case-sensitive BLEU score using sacreBLEU (Post 2018). Signature: `BLEU+case.mixed+numrefs.1+smooth.exp+tok.13a+version.1.4.3` |
| Experiment Setup | Yes | The learning rate is 0.0005 and warmup step is 4000. All the dropout probabilities are set to 0.3. The batch size is 32k tokens. Maximum updates number is 100k for the De-En language pair and 50k for the Ro-En language pair. For training the EAM, the maximum updates number is 10k. |
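
The hyperparameters quoted in the Experiment Setup row can be collected in one place for anyone attempting a reproduction. The following is a minimal sketch, not the authors' configuration: the class, field, and function names are hypothetical (they are not fairseq options), and reading "32k tokens" as 32,000 is an assumption. Only the numeric values come from the row above.

```python
from dataclasses import dataclass

# Hypothetical container; every value is copied from the "Experiment Setup" row.
@dataclass(frozen=True)
class TrainSettings:
    learning_rate: float = 0.0005
    warmup_steps: int = 4000
    dropout: float = 0.3            # applied to all dropout probabilities
    batch_tokens: int = 32_000      # "32k tokens", read as 32,000 (assumption)
    max_updates: int = 100_000      # default is the De-En value
    eam_max_updates: int = 10_000   # updates used when training the EAM

# Maximum number of updates reported per language pair.
MAX_UPDATES = {"de-en": 100_000, "ro-en": 50_000}

def settings_for(pair: str) -> TrainSettings:
    """Return the reported settings for a language pair such as 'De-En'."""
    key = pair.lower()
    if key not in MAX_UPDATES:
        raise ValueError(f"no reported settings for language pair: {pair!r}")
    return TrainSettings(max_updates=MAX_UPDATES[key])

if __name__ == "__main__":
    for pair in MAX_UPDATES:
        print(pair, settings_for(pair))
```

Mapping these values onto actual training options (for example, how the 32k-token batch is split across GPUs) is not specified in the table and would need to be checked against the released code.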