Deliberation Networks: Sequence Generation Beyond One-Pass Decoding
Authors: Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on neural machine translation and text summarization demonstrate the effectiveness of the proposed deliberation networks. On the WMT 2014 English-to-French translation task, our model establishes a new state-of-the-art BLEU score of 41.5. |
| Researcher Affiliation | Collaboration | (1) University of Science and Technology of China, Hefei, China; (2) Microsoft Research, Beijing, China; (3) Sun Yat-sen University, Guangzhou, China |
| Pseudocode | Yes | Algorithm 1: Algorithm to train the deliberation network (see the illustrative sketch after this table) |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about releasing the source code for its method. |
| Open Datasets | Yes | For En→Fr, we employ the standard filtered WMT'14 dataset... For Zh→En, we choose 1.25M bilingual sentence pairs from LDC dataset... The training, validation and test sets for the task are extracted from Gigaword Corpus [6] |
| Dataset Splits | Yes | We concatenate newstest2012 and newstest2013 together as the validation set and use newstest2014 as the test set. For Zh→En, we choose 1.25M bilingual sentence pairs from LDC dataset as training corpus, use NIST2003 as the validation set, and NIST2004, NIST2005, NIST2006, NIST2008 as the test sets. |
| Hardware Specification | Yes | All the models are trained on a single NVIDIA K40 GPU. |
| Software Dependencies | No | The paper mentions that the models are "implemented in Theano [24]" but does not specify a version number for Theano or other software dependencies. |
| Experiment Setup | Yes | The word embedding dimension is set as 620. For Zh→En, we apply 0.5 dropout rate to the layer before softmax and no dropout is used in En→Fr translation. ... Plain SGD is used as the optimizer in this process, with initial learning rate 0.2 and halving according to validation accuracy (see the schedule sketch after this table). To sample the intermediate translation output by the first decoder, we use beam search with beam size 2, considering the tradeoff between accuracy and efficiency. |
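
The Pseudocode row refers to the paper's Algorithm 1, which trains a two-pass model: a first decoder drafts an output, and a second decoder refines it while attending over both the source states and the draft states. The following PyTorch sketch only illustrates that structure; it is not the authors' Theano implementation. The GRU backbone, module names, dimensions other than the quoted 620-dimensional embedding, and the greedy draft decoding are all assumptions (the paper samples the draft with beam search, beam size 2).

```python
import torch
import torch.nn as nn

class DeliberationSketch(nn.Module):
    """Illustrative two-pass (deliberation) decoder; not the paper's code."""

    def __init__(self, vocab_size, emb_dim=620, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # First-pass decoder attends over source states only.
        self.dec1 = nn.GRUCell(emb_dim + hid_dim, hid_dim)
        # Second-pass decoder attends over source AND first-pass states.
        self.dec2 = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.out1 = nn.Linear(hid_dim, vocab_size)
        self.out2 = nn.Linear(hid_dim, vocab_size)

    def attend(self, query, keys):
        # Dot-product attention for brevity; the paper uses additive attention.
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)   # (B, T)
        weights = torch.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)   # (B, H)

    def forward(self, src, max_len=20, bos_id=1):
        src_states, h = self.encoder(self.embed(src))
        h = h.squeeze(0)

        # Pass 1: decode a draft (greedy here; the paper uses beam size 2).
        y = torch.full((src.size(0),), bos_id, dtype=torch.long)
        draft_states, draft_logits = [], []
        for _ in range(max_len):
            ctx = self.attend(h, src_states)
            h = self.dec1(torch.cat([self.embed(y), ctx], dim=1), h)
            draft_states.append(h)
            logits = self.out1(h)
            draft_logits.append(logits)
            y = logits.argmax(dim=1)
        draft = torch.stack(draft_states, dim=1)

        # Pass 2: refine, attending over source and draft states jointly.
        h2 = draft[:, -1]
        y = torch.full((src.size(0),), bos_id, dtype=torch.long)
        final_logits = []
        for _ in range(max_len):
            src_ctx = self.attend(h2, src_states)
            drf_ctx = self.attend(h2, draft)
            inp = torch.cat([self.embed(y), src_ctx, drf_ctx], dim=1)
            h2 = self.dec2(inp, h2)
            final_logits.append(self.out2(h2))
            y = final_logits[-1].argmax(dim=1)
        return torch.stack(draft_logits, 1), torch.stack(final_logits, 1)

# Usage on dummy data: batch of 2 source sentences of length 7.
model = DeliberationSketch(vocab_size=30000)
draft_logits, final_logits = model(torch.randint(0, 30000, (2, 7)))
```

In training, Algorithm 1 optimizes both decoders, so a loss would be applied to both `draft_logits` and `final_logits`; the draft fed to the second pass comes from beam search rather than the greedy loop shown here.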
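
The optimizer schedule in the Experiment Setup row (plain SGD, initial learning rate 0.2, halved according to validation accuracy) maps naturally onto a plateau-based scheduler. Below is a minimal sketch, assuming PyTorch; the stand-in model, epoch count, and placeholder validation metric are not from the paper.

```python
import torch

# Stand-in model; only the quoted hyperparameters below come from the paper.
model = torch.nn.Linear(620, 620)

# Plain SGD with initial learning rate 0.2, as quoted in the table.
optimizer = torch.optim.SGD(model.parameters(), lr=0.2)

# "Halving according to validation accuracy" expressed as a plateau
# scheduler: halve the learning rate when accuracy stops improving.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=1)

for epoch in range(5):
    # ... one training epoch would run here ...
    val_accuracy = 0.0  # placeholder; use a real validation metric
    scheduler.step(val_accuracy)
    print(epoch, optimizer.param_groups[0]["lr"])
```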