Towards Zero Unknown Word in Neural Machine Translation
Authors: Xiaoqing Li, Jiajun Zhang, Chengqing Zong
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Chinese-to-English translation demonstrate that our proposed method can achieve more than 4 BLEU points over the attention-based NMT. |
| Researcher Affiliation | Academia | National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences CAS Center for Excellence in Brain Science and Intelligence Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | The bilingual data to train the NMT model is selected from LDC, which contains about 0.6M sentence pairs. ... We use the word2vec toolkit [Mikolov et al., 2013] to train word vectors on the monolingual data, which is the combination of the source side of the bilingual data and the Chinese Gigaword Xinhua portion. ... the English language model is trained on the combination of the target side of the bilingual data and the English Gigaword. |
| Dataset Splits | Yes | The NIST 03 dataset is chosen as the development set, which is used to monitor the training process and decide the early-stopping condition. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software tools like 'Berkeley Aligner', 'word2vec toolkit', and 'kenlm', but does not provide specific version numbers for them. |
| Experiment Setup | Yes | We limit both the source and target vocabulary to 30k in our experiments. The number of hidden units is 1,000 for both the encoder and decoder. And the word embedding dimension is 500 for all source and target words. The parameters in the network are updated with the adadelta algorithm. |
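The setup row reports that parameters are updated with the adadelta algorithm but gives no further optimizer details. As a point of reference, here is a minimal from-scratch sketch of the standard Adadelta update rule (Zeiler, 2012) on a scalar toy objective; the decay rate `rho` and epsilon value are illustrative defaults, not values taken from the paper.

```python
import math

def adadelta_step(x, grad, eg2, edx2, rho=0.95, eps=1e-6):
    """One Adadelta update for a scalar parameter.

    eg2:  running average of squared gradients, E[g^2]
    edx2: running average of squared updates,  E[dx^2]
    rho and eps are illustrative defaults, not from the paper.
    """
    eg2 = rho * eg2 + (1.0 - rho) * grad * grad
    # Update scaled by the ratio of accumulated RMS values.
    dx = -math.sqrt(edx2 + eps) / math.sqrt(eg2 + eps) * grad
    edx2 = rho * edx2 + (1.0 - rho) * dx * dx
    return x + dx, eg2, edx2

# Toy demo: minimize f(x) = x^2 (gradient 2x) starting from x = 1.0.
x, eg2, edx2 = 1.0, 0.0, 0.0
for _ in range(1000):
    x, eg2, edx2 = adadelta_step(x, 2.0 * x, eg2, edx2)
```

Adadelta needs no hand-tuned global learning rate, which is a common reason NMT papers of this period chose it over plain SGD.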