Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation

Authors: Iulian Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bowen Zhou, Yoshua Bengio, Aaron Courville

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, the coarse sequences are extracted using automatic procedures, which are designed to capture compositional structure and semantics. We apply the models to dialogue response generation in the technical support domain and compare them with several competing models. The multiresolution recurrent neural networks outperform competing models by a substantial margin, achieving state-of-the-art results according to both a human evaluation study and automatic evaluation metrics.
Researcher Affiliation | Collaboration | Iulian Vlad Serban, University of Montreal, 2920 chemin de la Tour, Montréal, QC, Canada
Pseudocode | No | No pseudocode or algorithm blocks are present in the paper.
Open Source Code | Yes | The pre-processed Ubuntu Dialogue Corpus and the coarse representations can be downloaded at http://www.iulianserban.com/Files/UbuntuDialogueCorpus.zip and https://github.com/julianser/Ubuntu-Multiresolution-Tools.
Open Datasets | Yes | The specific task we consider is technical support for the Ubuntu operating system; the data we use is the Ubuntu Dialogue Corpus developed by Lowe et al. (2015). The pre-processed Ubuntu Dialogue Corpus and the coarse representations can be downloaded at http://www.iulianserban.com/Files/UbuntuDialogueCorpus.zip and https://github.com/julianser/Ubuntu-Multiresolution-Tools.
Dataset Splits | No | The models are trained using early stopping with patience based on the validation set log-likelihood. We choose model hyperparameters such as the number of hidden units, word embedding dimensionality, and learning rate based on the validation set log-likelihood. (No explicit split sizes or percentages are given; the early-stopping procedure is sketched in code below the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided in the paper.
Software Dependencies | No | We implement all models in Theano (Theano Development Team 2016). (No specific version number for Theano is provided.)
Experiment Setup | Yes | We train all models w.r.t. the log-likelihood or joint log-likelihood on the training set using Adam (Kingma and Ba 2015). The models are trained using early stopping with patience based on the validation set log-likelihood. We choose model hyperparameters such as the number of hidden units, word embedding dimensionality, and learning rate based on the validation set log-likelihood. We use gradient clipping to stop the parameters from exploding (Pascanu, Mikolov, and Bengio 2012). We define the 20,000 most frequent words as the vocabulary, and map all other words to a special unknown token. Based on several experiments, we fix the word embedding dimensionality to size 300 for all models. At test time, we use a beam search of size 5 for generating the model responses. The RNNLM model has 2000 hidden units... The HRED model has 500, 1000, and 500 hidden units... MrRNN... has 1000, 1000, and 2000 hidden units respectively for the coarse-level encoder, context, and decoder RNNs. The natural language sub-model... has 500, 1000, and 2000 hidden units... The coarse prediction encoder GRU RNN has 500 hidden units.
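
The Dataset Splits and Experiment Setup rows describe Adam training, gradient clipping, and early stopping with patience on the validation log-likelihood, but the authors' Theano code is not quoted. The following is a minimal PyTorch-style sketch of that training recipe, not the paper's implementation: the vocabulary size and embedding dimensionality come from the rows above, while the patience value, clipping threshold, stand-in network, and random data are illustrative assumptions.

    # Hedged sketch (not the authors' Theano code): Adam training with gradient
    # clipping and early stopping with patience on the validation log-likelihood.
    import torch
    import torch.nn as nn

    VOCAB_SIZE = 20000   # 20,000 most frequent words; all others map to an unknown token
    EMBED_DIM = 300      # word embedding dimensionality fixed to 300
    PATIENCE = 5         # assumed patience value (not stated in the paper)
    CLIP_NORM = 1.0      # assumed gradient-clipping threshold (not stated in the paper)

    model = nn.Sequential(
        nn.Embedding(VOCAB_SIZE, EMBED_DIM),
        nn.GRU(EMBED_DIM, 500, batch_first=True),  # stand-in for the MrRNN sub-models
    )
    optimizer = torch.optim.Adam(model.parameters())

    def validation_log_likelihood(model):
        # Placeholder: would compute the log-likelihood on the held-out validation set.
        with torch.no_grad():
            return -float(torch.rand(1))

    best_ll, epochs_without_improvement = float("-inf"), 0
    while epochs_without_improvement < PATIENCE:
        # One "epoch" on a dummy batch standing in for the Ubuntu Dialogue Corpus.
        tokens = torch.randint(0, VOCAB_SIZE, (32, 20))
        optimizer.zero_grad()
        outputs, _ = model(tokens)
        loss = outputs.pow(2).mean()  # stand-in for the negative log-likelihood
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)  # clip exploding gradients
        optimizer.step()

        # Early stopping with patience based on the validation log-likelihood.
        current_ll = validation_log_likelihood(model)
        if current_ll > best_ll:
            best_ll, epochs_without_improvement = current_ll, 0
        else:
            epochs_without_improvement += 1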
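
The Experiment Setup row also notes that responses are generated at test time with a beam search of size 5. Below is a small framework-free sketch of that decoding procedure; the log_prob_fn callable, the toy four-token vocabulary, and the length cap are hypothetical stand-ins for the model's decoder, not details from the paper.

    # Hedged sketch of beam-search decoding with beam size 5, as described above.
    import math

    def beam_search(log_prob_fn, start_token, end_token, beam_size=5, max_len=20):
        # Each beam is a (token sequence, cumulative log-probability) pair.
        beams = [([start_token], 0.0)]
        finished = []
        for _ in range(max_len):
            candidates = []
            for prefix, score in beams:
                if prefix[-1] == end_token:
                    finished.append((prefix, score))
                    continue
                # Expand the prefix with every vocabulary token and its log-probability.
                for token, lp in enumerate(log_prob_fn(prefix)):
                    candidates.append((prefix + [token], score + lp))
            if not candidates:
                break
            # Keep only the beam_size highest-scoring partial responses.
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        return max(finished + beams, key=lambda c: c[1])[0]

    # Toy usage: a uniform distribution over four tokens, where token 3 ends the response.
    uniform = lambda prefix: [math.log(0.25)] * 4
    print(beam_search(uniform, start_token=0, end_token=3))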