Coherent Dialogue with Attention-Based Language Models
Authors: Hongyuan Mei, Mohit Bansal, Matthew Walter
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the model on two popular dialogue datasets, the open-domain Movie Triples dataset and the closed-domain Ubuntu Troubleshoot dataset, and achieve significant improvements over the state-of-the-art and baselines on several metrics, including complementary diversity-based metrics, human evaluation, and qualitative visualizations. We also show that a vanilla RNN with dynamic attention outperforms more complex memory models (e.g., LSTM and GRU) by allowing for flexible, long-distance memory. |
| Researcher Affiliation | Academia | Hongyuan Mei Johns Hopkins University hmei@cs.jhu.edu, Mohit Bansal UNC Chapel Hill mbansal@cs.unc.edu, Matthew R. Walter TTI-Chicago mwalter@ttic.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. The model is described using mathematical equations. |
| Open Source Code | No | No concrete access to source code (e.g., repository link, explicit statement of code release) for the described methodology was found. |
| Open Datasets | Yes | We train and evaluate the models on two large natural language dialogue datasets, Movie Triples (preprocessed by Serban et al. (2016)) and Ubuntu Troubleshoot (preprocessed by Luan, Ji, and Ostendorf (2016)). For the Movie Triples dataset, we follow the same procedure as Serban et al. (2016) and first pretrain on the large Q-A SubTle dataset (Ameixa et al. 2014)... |
| Dataset Splits | No | The paper mentions using a 'development set' and a 'held-out set' for model selection and early stopping, stating 'We perform model selection using PPL on the development set'. However, the main body does not give split percentages or sample counts for the training, validation, and test sets, deferring such details to an appendix. |
| Hardware Specification | No | The paper acknowledges 'NVIDIA Corporation for donating GPUs used in this research,' but does not provide specific details such as GPU models, CPU types, or memory configurations. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma and Ba 2015) for optimization' but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | No | The paper states 'The arXiv version's appendix provides additional training details, including the hyperparameter settings,' but these specific details are not present in the main text. |
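Since no pseudocode or source code accompanies the paper, the core idea it describes, a vanilla RNN whose update is augmented with dynamic attention over all past hidden states, can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' implementation: all dimensions, weight names (`W_h`, `W_x`, `W_a`), and the exact way the attention context is combined with the input are assumptions for the sake of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the paper.
d = 8   # hidden-state size
T = 5   # number of past hidden states attended over

# Hypothetical parameters of the attention-augmented vanilla RNN cell.
W_h = rng.normal(scale=0.1, size=(d, d))  # hidden-to-hidden weights
W_x = rng.normal(scale=0.1, size=(d, d))  # input-to-hidden weights
W_a = rng.normal(scale=0.1, size=(d,))    # attention scoring vector

def attention_rnn_step(h_prev, x_t, memory):
    """One step of a vanilla RNN whose input is augmented with an
    attention-weighted summary of all past hidden states (`memory`)."""
    # Score each past hidden state, then normalize with a softmax.
    scores = memory @ W_a                   # shape (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Attention context: convex combination of past hidden states.
    context = weights @ memory              # shape (d,)
    # Vanilla tanh-RNN update; the context is simply added to the input
    # (one plausible combination rule among several the model could use).
    h_t = np.tanh(W_h @ h_prev + W_x @ (x_t + context))
    return h_t, weights

memory = rng.normal(size=(T, d))            # stand-in for past hidden states
h, w = attention_rnn_step(np.zeros(d), rng.normal(size=d), memory)
```

Because the attention weights are recomputed at every step over the full history, the model can reach back arbitrarily far, which is the "flexible, long-distance memory" the paper credits for outperforming LSTM- and GRU-based memory models.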