Variational Recurrent Neural Machine Translation
Authors: Jinsong Su, Shan Wu, Deyi Xiong, Yaojie Lu, Xianpei Han, Biao Zhang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models. |
| Researcher Affiliation | Academia | Xiamen University, Xiamen, China; Institute of Software, Chinese Academy of Sciences, Beijing, China; Soochow University, Suzhou, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions open-source or re-implemented systems for comparison (Moses, DL4MT tutorial) but does not provide concrete access to its own source code for the methodology described. |
| Open Datasets | Yes | Our Chinese-English training data consists of 1.25M LDC sentence pairs... In English-German translation, our training data consists of 4.46M sentence pairs... We used the NIST MT02 dataset... and the NIST MT03/04/05/06 datasets... We used the news-test 2013 as the validation set and the news-test 2015 as the test set. |
| Dataset Splits | Yes | We used the NIST MT02 dataset as the validation set... We used the news-test 2013 as the validation set |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only general training parameters. |
| Software Dependencies | No | The paper mentions using Rmsprop and the Moses script but does not provide specific version numbers for these or any other software libraries or frameworks. |
| Experiment Setup | Yes | We applied Rmsprop (Graves 2013) with iterNum=5, momentum=0, ρ=0.95, and ϵ=1×10⁻⁴ to train various NMT models... Specifically, we set word embedding dimension as 620, hidden layer size as 1000, learning rate as 5×10⁻⁴, batch size as 80, gradient norm as 1.0, and dropout rate as 0.3. Particularly, we initialized the parameters of VRNMT with the trained conventional NMT model. As implemented in VAE, we set the sampling number L=1, and d_e = d_z = 2d_f = 2000 according to preliminary experiments. During decoding, we used the beam-search algorithm, and set beam sizes of all models as 10. |
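
The Open Datasets and Dataset Splits rows above fix the train/validation/test partition for both language pairs. A minimal sketch of that split layout as a configuration mapping is given below; the dictionary keys and corpus labels are illustrative assumptions on my part (the LDC and WMT corpora are licensed and distributed separately, so no file paths are implied).

```python
# Hypothetical layout of the splits reported in the paper.
# Corpus labels are descriptive only; actual file names/paths are not given in the paper.
DATASET_SPLITS = {
    "zh-en": {
        "train": "LDC parallel corpus (~1.25M sentence pairs)",
        "dev": "NIST MT02",
        "test": ["NIST MT03", "NIST MT04", "NIST MT05", "NIST MT06"],
    },
    "en-de": {
        "train": "WMT parallel corpus (~4.46M sentence pairs)",
        "dev": "newstest2013",
        "test": ["newstest2015"],
    },
}

if __name__ == "__main__":
    for pair, splits in DATASET_SPLITS.items():
        print(f"{pair}: dev={splits['dev']}, test={splits['test']}")
```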
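
The Experiment Setup row quotes the training and decoding hyperparameters verbatim. Below is a minimal sketch, assuming a PyTorch-style implementation, that collects those reported values into a configuration object and builds an RMSprop optimizer with gradient-norm clipping; the authors trained with their own Rmsprop (Graves 2013) code on top of the DL4MT tutorial system, so `torch.optim.RMSprop` and the helper names here are approximations, not the paper's implementation.

```python
from dataclasses import dataclass

import torch


@dataclass
class VRNMTConfig:
    # Values quoted from the paper's experiment setup.
    emb_dim: int = 620          # word embedding dimension
    hidden_dim: int = 1000      # hidden layer size
    latent_dim: int = 2000      # d_e = d_z = 2 d_f = 2000
    learning_rate: float = 5e-4
    batch_size: int = 80
    grad_norm: float = 1.0      # gradient norm clipping threshold
    dropout: float = 0.3
    rho: float = 0.95           # RMSprop decay (called alpha in PyTorch)
    eps: float = 1e-4
    momentum: float = 0.0
    sample_count: int = 1       # L = 1 Monte Carlo sample for the VAE term
    beam_size: int = 10         # beam size used for all models at decoding time


def build_optimizer(model: torch.nn.Module, cfg: VRNMTConfig) -> torch.optim.Optimizer:
    """RMSprop roughly matching the reported settings (approximation of Graves 2013)."""
    return torch.optim.RMSprop(
        model.parameters(),
        lr=cfg.learning_rate,
        alpha=cfg.rho,
        eps=cfg.eps,
        momentum=cfg.momentum,
    )


def clip_and_step(model: torch.nn.Module, optimizer: torch.optim.Optimizer, cfg: VRNMTConfig) -> None:
    """Apply the reported gradient-norm clipping before each parameter update."""
    torch.nn.utils.clip_grad_norm_(model.parameters(), cfg.grad_norm)
    optimizer.step()
    optimizer.zero_grad()
```

Note that `alpha` in `torch.optim.RMSprop` plays the role of ρ in the paper's notation, and the quoted `iterNum=5` has no direct counterpart in this sketch.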