Unsupervised Neural Machine Translation with SMT as Posterior Regularization
Authors: Shuo Ren, Zhirui Zhang, Shujie Liu, Ming Zhou, Shuai Ma
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on en-fr and en-de translation tasks show that our method significantly outperforms the strong baseline (Lample et al. 2018) and achieves the new state-of-the-art translation performance in unsupervised machine translation. |
| Researcher Affiliation | Collaboration | Shuo Ren (1), Zhirui Zhang (2), Shujie Liu (3), Ming Zhou (3), Shuai Ma (1); (1) SKLSDE Lab, Beihang University and Beijing Advanced Innovation Center for Big Data and Brain Computing, China; (2) University of Science and Technology of China, Hefei, China; (3) Microsoft Research Asia |
| Pseudocode | Yes | Algorithm 1: Unsupervised NMT with SMT as PR |
| Open Source Code | No | The paper does not provide an explicit statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | For each language, we use 50 million monolingual sentences in News Crawl, a monolingual dataset from WMT, which is the same as the previous work (Artetxe et al. 2017; Lample et al. 2018). |
| Dataset Splits | No | The paper mentions 'newstest 2014' and 'newstest 2016' as test sets but does not specify explicit training/validation/test splits, nor does it provide details about a validation set used for their NMT models. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions several software tools like Moses, word2vec, vecmap, Transformer (via tensor2tensor), and Salm, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We share the vocabulary space of 50,000 BPE codes (Sennrich, Haddow, and Birch 2015) for source and target languages. For each language pair, we train two independent NMT models for different translation directions (i.e., source to target and target to source) with shared embedding layers of source and target sides. ... In that stage, there are three hyperparameters described in Section 3.2 that should be taken into account, i.e., the peakiness controller λ, the vocabulary size S or T, and the number of translation candidates k for each word. |
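
The Experiment Setup row names three hyperparameters: the peakiness controller λ, the vocabulary size S or T, and the number of translation candidates k per word. Below is a minimal, runnable sketch of how these quantities could interact when deriving an initial word-translation table from cross-lingual embeddings; all variable names, toy sizes, and the random embeddings are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

# Hypothetical sketch: how a peakiness controller (LAMBDA), vocabulary sizes
# (S, T), and a per-word candidate count (K) might shape an initial
# word-translation table built from cross-lingual embeddings.
# Values below are toy placeholders (the paper uses a 50k BPE vocabulary).
LAMBDA = 30.0      # peakiness controller for the translation distribution
S, T = 1000, 1200  # source / target vocabulary sizes
K = 10             # translation candidates kept per source word
DIM = 64           # embedding dimensionality

rng = np.random.default_rng(0)
src_emb = rng.normal(size=(S, DIM))  # stand-ins for aligned cross-lingual embeddings
tgt_emb = rng.normal(size=(T, DIM))

# Cosine similarity between every source and target word embedding.
src_norm = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
tgt_norm = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
sim = src_norm @ tgt_norm.T          # shape (S, T)

# Keep the top-K target candidates per source word and renormalize with a
# peaked softmax; larger LAMBDA concentrates mass on the nearest candidates.
topk_idx = np.argsort(-sim, axis=1)[:, :K]
topk_sim = np.take_along_axis(sim, topk_idx, axis=1)
scores = np.exp(LAMBDA * topk_sim)
probs = scores / scores.sum(axis=1, keepdims=True)  # p(t | s) over K candidates

print("candidates for source word 0:", topk_idx[0])
print("their probabilities:", np.round(probs[0], 3))
```

In this reading, λ behaves as the "peakiness controller" the quoted setup refers to: larger values sharpen the per-word translation distribution, while k bounds how many candidates survive into the table.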