An Ensemble of Retrieval-Based and Generation-Based Human-Computer Conversation Systems
Authors: Yiping Song, Cheng-Te Li, Jian-Yun Nie, Ming Zhang, Dongyan Zhao, Rui Yan
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that such an ensemble system outperforms each single module by a large margin. |
| Researcher Affiliation | Academia | Institute of Network Computing and Information Systems, School of EECS, Peking University, China; Department of Statistics, National Cheng Kung University, Taiwan; University of Montreal, Canada; Institute of Computer Science and Technology, Peking University, China. {songyiping, mzhang_cs, zhaody, ruiyan}@pku.edu.cn, chengte@mail.ncku.edu.tw, nie@iro.umontreal.ca |
| Pseudocode | No | The paper describes the model architecture and processes with text and diagrams but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | No | To train our neural models, we implement code based on dl4mt-tutorial (https://github.com/nyu-dl/dl4mt-tutorial), and follow Shang et al. (2015) for hyperparameter settings as it generally works well in our model. The paper states that their code is based on a public tutorial, but does not explicitly state that the code for their specific ensemble methodology is open-source or provided. |
| Open Datasets | Yes | To construct a database for information retrieval, we collected human-human utterances from massive online forums, microblogs, and question-answering communities, including Sina Weibo and Baidu Tieba. ... For the generation part, we use the dataset comprising 1,606,741 query-reply pairs originating from Baidu Tieba. |
| Dataset Splits | Yes | We randomly selected 1.5 million pairs for training and 100K pairs for validation. The remaining 6,741 pairs are used for testing, both for the generation part and for the whole system. Table 2 (Statistics of our datasets): Generator training 1,500,000; validation 100,000; testing 6,741. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | To train our neural models, we implement code based on dl4mt-tutorial, and follow Shang et al. (2015) for hyperparameter settings as it generally works well in our model. While the paper mentions that its code is based on a tutorial, it does not specify version numbers for any software dependencies such as Theano, Python, or CUDA. |
| Experiment Setup | Yes | All the embeddings are set to 620 dimensions and the hidden states to 1,000 dimensions. We apply AdaDelta [Zeiler, 2012] with a minibatch size of 80. Chinese word segmentation is performed on all utterances. We keep a vocabulary of 100K words for queries and 30K words for the retrieved and generated replies due to efficiency concerns. The validation set is used only for early stopping based on the perplexity measure. (See the configuration sketch after the table.) |
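
The reported split is a simple random partition of the 1,606,741 Baidu Tieba query-reply pairs. Below is a minimal sketch of how such a split could be reproduced; the function name, the seed, and the assumption that the corpus fits in memory as (query, reply) tuples are illustrative, not taken from the paper:

```python
import random

# Counts quoted in the paper: 1,500,000 train, 100,000 validation,
# and the remaining 6,741 pairs for testing (1,606,741 total).
TRAIN_SIZE = 1_500_000
VALID_SIZE = 100_000

def split_pairs(pairs, seed=0):
    """Randomly partition (query, reply) pairs into train/valid/test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # the fixed seed is an assumption for repeatability
    train = pairs[:TRAIN_SIZE]
    valid = pairs[TRAIN_SIZE:TRAIN_SIZE + VALID_SIZE]
    test = pairs[TRAIN_SIZE + VALID_SIZE:]  # 6,741 pairs left over
    return train, valid, test
```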
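The quoted hyperparameters map onto a standard encoder-decoder. The paper's implementation follows the Theano-based dl4mt-tutorial, so the PyTorch skeleton below is only a sketch showing where each reported value plugs in; the class structure and the single-layer GRUs without attention are assumptions:

```python
import torch
import torch.nn as nn

EMBED_DIM = 620        # "All the embeddings are set to 620-dimension"
HIDDEN_DIM = 1000      # "the hidden states are set to 1000-dimension"
QUERY_VOCAB = 100_000  # 100K-word vocabulary kept for queries
REPLY_VOCAB = 30_000   # 30K-word vocabulary for retrieved/generated replies

class Seq2SeqSketch(nn.Module):
    """Minimal encoder-decoder; attention and other dl4mt details omitted."""
    def __init__(self):
        super().__init__()
        self.src_embed = nn.Embedding(QUERY_VOCAB, EMBED_DIM)
        self.tgt_embed = nn.Embedding(REPLY_VOCAB, EMBED_DIM)
        self.encoder = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.decoder = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, REPLY_VOCAB)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.src_embed(src_ids))        # final encoder state
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), h)
        return self.out(dec_out)                            # logits over reply vocabulary

model = Seq2SeqSketch()
optimizer = torch.optim.Adadelta(model.parameters())  # AdaDelta; minibatch size 80
```

Early stopping on validation perplexity, as the paper describes, would sit in the training loop, which is omitted here.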