Towards a Neural Conversation Model With Diversity Net Using Determinantal Point Processes

Authors: Yiping Song, Rui Yan, Yansong Feng, Yaoyuan Zhang, Dongyan Zhao, Ming Zhang

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our model achieves the best performance among various baselines in terms of both quality and diversity. We evaluated our approach on a massive Chinese conversation dataset crawled from Baidu Tieba. There were 1,600,000 query-reply pairs for training, 2000 pairs for validation, and another unseen 2000 pairs for testing."
Researcher Affiliation | Academia | Yiping Song (1), Rui Yan (2,3), Yansong Feng (2), Yaoyuan Zhang (2), Dongyan Zhao (2,3), Ming Zhang (1). (1) Institute of Network Computing and Information Systems, School of EECS, Peking University, China; (2) Institute of Computer Science and Technology, Peking University, China; (3) Beijing Institute of Big Data Research, China
Pseudocode | Yes | "Algorithm 1: The DPP-D algorithm" (a hedged illustration of greedy DPP selection follows this table)
Open Source Code | No | "Codes and sample data will be soon available at: https://github.com/stellasyp/DPP-Conversational-System"
Open Datasets | No | "We evaluated our approach on a massive Chinese conversation dataset crawled from Baidu Tieba." [Footnote 2: http://tieba.baidu.com] The paper names the source of the data but provides neither direct access to the processed dataset used for the experiments nor a specific citation for it.
Dataset Splits | Yes | "There were 1,600,000 query-reply pairs for training, 2000 pairs for validation, and another unseen 2000 pairs for testing."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications, or cloud resources) used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or framework names such as PyTorch or TensorFlow, or a specific Python version).
Experiment Setup | Yes | "To train the neural conversation models, we followed the hyperparameter settings in (Shang, Lu, and Li 2015; Song et al. 2016). The word embeddings were 610d and hidden layers were 1000d. We applied AdaDelta with default hyperparameters, where batch size is 80. We kept 100K words (Chinese terms) for queries, and 30K for replies due to efficiency concerns. The beam size k was 20..." (the reported hyperparameters are gathered in a sketch after this table)
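
The paper's Algorithm 1 (DPP-D) is only named above, not reproduced. As a rough illustration of the underlying idea, here is a minimal sketch of greedy MAP inference for a determinantal point process over beam-search candidates. The quality-times-similarity kernel decomposition and all names (greedy_dpp_select, quality, features) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def greedy_dpp_select(candidates, quality, features, k):
    """Greedily pick k candidates that approximately maximize det(L_S).

    L is a standard quality/diversity DPP kernel (an assumption here):
    L[i, j] = quality[i] * cosine_sim(i, j) * quality[j].
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T                          # cosine similarity
    L = quality[:, None] * sim * quality[None, :]
    L += 1e-8 * np.eye(len(L))                     # ridge for numerical stability

    selected, remaining = [], list(range(len(candidates)))
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            # log det of the submatrix indexed by the tentative selection
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]

if __name__ == "__main__":
    cands = ["reply_a", "reply_b", "reply_c", "reply_d"]
    q = np.array([0.9, 0.8, 0.7, 0.6])   # per-candidate quality scores
    f = np.random.rand(4, 16)            # stand-in sentence embeddings
    print(greedy_dpp_select(cands, q, f, k=2))
```

In a conversation setting, the candidates would be the replies produced by beam search (beam size 20 in the paper), and the determinant trades off individual reply quality against pairwise similarity, so near-duplicate replies are penalized.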
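
For convenience, the hyperparameters quoted in the Experiment Setup row can be gathered in one place. This is only a sketch; the key names are hypothetical, and anything the paper leaves unstated (learning-rate schedule, number of layers, etc.) is omitted.

```python
# Hyperparameters as reported in the paper; key names are hypothetical.
TRAIN_CONFIG = {
    "word_embedding_dim": 610,    # "word embeddings were 610d"
    "hidden_dim": 1000,           # "hidden layers were 1000d"
    "optimizer": "AdaDelta",      # with default hyperparameters
    "batch_size": 80,
    "query_vocab_size": 100_000,  # 100K Chinese terms kept for queries
    "reply_vocab_size": 30_000,   # 30K kept for replies (efficiency)
    "beam_size": 20,              # "The beam size k was 20"
    "train_pairs": 1_600_000,
    "valid_pairs": 2_000,
    "test_pairs": 2_000,
}
```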