Neural Response Generation With Dynamic Vocabularies

Authors: Yu Wu, Wei Wu, Dejian Yang, Can Xu, Zhoujun Li

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on both automatic metrics and human annotations show that DVS2S can significantly outperform state-of-the-art methods in terms of response quality, while requiring only 60% of the decoding time of the most efficient baseline. |
| Researcher Affiliation | Collaboration | Yu Wu, Wei Wu, Dejian Yang, Can Xu, Zhoujun Li; State Key Lab of Software Development Environment, Beihang University, Beijing, China; Microsoft Research, Beijing, China |
| Pseudocode | Yes | Algorithm 1: Optimization Algorithm (a hedged sketch of one such sampled-vocabulary training step follows the table) |
| Open Source Code | No | The paper links to code for the baseline models (S2SA-MMI, TA-S2S, CVAE) and mentions implementing the model in Theano, but provides no link or explicit statement about releasing source code for the proposed DVS2S model. |
| Open Datasets | Yes | We use the data in (Xing et al. 2016) which consists of message-response pairs crawled from Baidu Tieba. |
| Dataset Splits | Yes | There are 5 million pairs in the training set, 10,000 pairs in the validation set, and 1,000 pairs in the test set. |
| Hardware Specification | Yes | The efficiency comparison is conducted on both a GPU environment with a single Tesla K80 and a CPU environment with 6 Intel Xeon E5-2690 CPUs @ 2.6 GHz. |
| Software Dependencies | Yes | We implement our model using Theano (Theano Development Team 2016). |
| Experiment Setup | Yes | In our model, we set the word embedding size as 620 and the hidden vector size as 1024 in both encoding and decoding. In Monte Carlo sampling, we set the number of samples S as 5. We set the beam size as 20 and use the top one response from beam search in evaluation. We employ the AdaDelta algorithm (Zeiler 2012) to train our model with a batch size of 64. We set the initial learning rate as 1.0 and reduce it by half if perplexity on the validation set begins to increase. (The reported settings are collected in the configuration sketch after the table.) |
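The Pseudocode row points to Algorithm 1, the paper's optimization algorithm, and the setup reports S = 5 Monte Carlo samples. The paper itself is the authoritative reference; as a rough illustration of what a sampled-vocabulary training step of this kind can look like, here is a minimal NumPy sketch. The Bernoulli word-selection head (`select_probs`), the placeholder `generation_nll`, and all sizes are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 10_000  # full vocabulary size (placeholder)
S = 5           # Monte Carlo samples per message-response pair, as reported

def select_probs(message_vec, W):
    """Hypothetical word-selection head: Bernoulli probability that each
    vocabulary word enters this message's dynamic vocabulary."""
    return 1.0 / (1.0 + np.exp(-(W @ message_vec)))

def generation_nll(response_ids, vocab_mask):
    """Placeholder for the seq2seq negative log-likelihood with the softmax
    restricted to the sampled vocabulary; a real implementation would run
    the decoder here. This toy version just penalizes response words that
    the sampled vocabulary excludes."""
    return float(np.sum(~vocab_mask[response_ids])) + rng.random()

def mc_training_loss(message_vec, response_ids, W):
    """One Monte Carlo estimate of the joint selection + generation loss:
    sample S dynamic vocabularies from the selection distribution and
    average the restricted generation losses and selection log-probs."""
    probs = select_probs(message_vec, W)
    losses, log_q = [], []
    for _ in range(S):
        mask = rng.random(VOCAB) < probs  # sample one vocabulary T_s
        losses.append(generation_nll(response_ids, mask))
        # log p(T_s | message) under the Bernoulli selection model
        log_q.append(np.sum(np.log(np.where(mask, probs, 1.0 - probs) + 1e-12)))
    return np.mean(losses), np.mean(log_q)

# Toy usage with random data (all shapes illustrative).
d = 8
W = 0.01 * rng.standard_normal((VOCAB, d))
loss, log_q = mc_training_loss(rng.standard_normal(d), np.array([3, 17, 42]), W)
```

In the real model, the two averaged terms would feed the gradient of the joint selection-and-generation objective; consult Algorithm 1 in the paper for the exact update.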
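The Experiment Setup row lists the hyperparameters needed to re-run training. As a convenience, here is a small framework-agnostic Python sketch that collects those reported values and encodes the stated learning-rate rule; the `DVS2SConfig` name and `next_learning_rate` helper are illustrative, and the original implementation was in Theano.

```python
from dataclasses import dataclass

@dataclass
class DVS2SConfig:
    # Values as reported in the paper's experiment setup.
    embedding_size: int = 620        # word embedding dimension
    hidden_size: int = 1024          # encoder/decoder hidden vector size
    mc_samples: int = 5              # Monte Carlo sample count S
    beam_size: int = 20              # beam width; top-1 result used in evaluation
    batch_size: int = 64             # AdaDelta mini-batch size
    init_learning_rate: float = 1.0  # halved when validation perplexity rises

def next_learning_rate(lr: float, val_ppl_history: list[float]) -> float:
    """Reported schedule: halve the learning rate as soon as perplexity on
    the validation set begins to increase."""
    if len(val_ppl_history) >= 2 and val_ppl_history[-1] > val_ppl_history[-2]:
        return lr / 2.0
    return lr

# Example: perplexity rises at the last checkpoint, so the rate is halved.
cfg = DVS2SConfig()
lr = next_learning_rate(cfg.init_learning_rate, [42.0, 38.5, 39.1])  # -> 0.5
```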