Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation

Authors: Xi Ai, Bin Fang (pp. 12471-12479)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive experiments support that our method can generally improve the performance of currently successful models on three similar pairs {French, German, Romanian}-English and one dissimilar pair Russian-English with acceptable additional cost.
Researcher Affiliation | Academia | Xi Ai, Bin Fang, College of Computer Science, Chongqing University; barid.x.ai@gmail.com, fb@cqu.edu.cn
Pseudocode | Yes | Algorithm 1: Local Alignment
Open Source Code | No | We implement our experiments on TensorFlow 2.0 (Abadi et al. 2016) and will open our source code on GitHub.
Open Datasets | Yes | Specifically, we first retrieve monolingual corpora {French, German, English, Russian} from WMT 2018 (Bojar et al. 2018), including all available News Crawl datasets from 2007 through 2017, and the Romanian monolingual corpus from WMT 2016 (Bojar et al. 2016), including News Crawl 2016. (The corpora are summarized in the config sketch after this table.)
Dataset Splits | No | The paper mentions specific test sets ('newstest2014', 'newstest2016') but does not provide explicit details about the train/validation splits (e.g., percentages, sample counts, or references to predefined splits) used from the WMT corpora for reproducing the experiments.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, memory specifications, or cloud computing instance types; it refers only generally to 'our machine'.
Software Dependencies | Yes | We implement our experiments on TensorFlow 2.0 (Abadi et al. 2016).
Experiment Setup | Yes | Adam optimizer (Kingma and Ba 2015) is used with parameters β1 = 0.9, β2 = 0.98, ϵ = 10^-9 and a dynamic learning rate over the course of training (Vaswani et al. 2017) with warmup_steps = 5000. We set dropout regularization with a drop rate of 0.1 and label smoothing with γ = 0.1 (Mezzini 2018). (See the optimizer sketch after this table.)
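
For reference, the corpora described in the Open Datasets row can be summarized as a small config dictionary. This is a hypothetical sketch: the language codes, dictionary layout, and the NEWS_CRAWL_CORPORA name are illustrative and not taken from the paper, which also does not give exact download locations.

    # Hypothetical summary of the monolingual training corpora described above;
    # exact download locations are not given in this section.
    NEWS_CRAWL_CORPORA = {
        "fr": {"wmt": 2018, "news_crawl_years": list(range(2007, 2018))},
        "de": {"wmt": 2018, "news_crawl_years": list(range(2007, 2018))},
        "en": {"wmt": 2018, "news_crawl_years": list(range(2007, 2018))},
        "ru": {"wmt": 2018, "news_crawl_years": list(range(2007, 2018))},
        "ro": {"wmt": 2016, "news_crawl_years": [2016]},
    }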
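
The reported optimizer and regularization settings map directly onto the TensorFlow 2.0 API the paper cites. Below is a minimal sketch, assuming a Transformer width of d_model = 512 (the section does not state the model width); everything else follows the hyperparameters quoted above.

    import tensorflow as tf

    class TransformerSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
        """Dynamic learning rate of Vaswani et al. (2017):
        lr = d_model**-0.5 * min(step**-0.5, step * warmup_steps**-1.5)."""

        def __init__(self, d_model=512, warmup_steps=5000):  # d_model is assumed
            super().__init__()
            self.d_model = tf.cast(d_model, tf.float32)
            self.warmup_steps = float(warmup_steps)

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            return tf.math.rsqrt(self.d_model) * tf.minimum(
                tf.math.rsqrt(step), step * self.warmup_steps ** -1.5)

    # Adam with the quoted hyperparameters and the warmup schedule.
    optimizer = tf.keras.optimizers.Adam(
        learning_rate=TransformerSchedule(warmup_steps=5000),
        beta_1=0.9, beta_2=0.98, epsilon=1e-9)

    # Dropout regularization with drop rate 0.1, applied inside the model.
    dropout = tf.keras.layers.Dropout(rate=0.1)

    # Cross-entropy with label smoothing gamma = 0.1.
    loss_fn = tf.keras.losses.CategoricalCrossentropy(
        from_logits=True, label_smoothing=0.1)

The schedule class mirrors the warmup-then-decay rule of the Transformer paper, which the section cites for the dynamic learning rate.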