Towards a Neural Conversation Model With Diversity Net Using Determinantal Point Processes
Authors: Yiping Song, Rui Yan, Yansong Feng, Yaoyuan Zhang, Dongyan Zhao, Ming Zhang
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our model achieves the best performance among various baselines in terms of both quality and diversity. We evaluated our approach on a massive Chinese conversation dataset crawled from Baidu Tieba. There were 1,600,000 query-reply pairs for training, 2000 pairs for validation, and another unseen 2000 pairs for testing. |
| Researcher Affiliation | Academia | Yiping Song (1), Rui Yan (2,3), Yansong Feng (2), Yaoyuan Zhang (2), Dongyan Zhao (2,3), Ming Zhang (1); (1) Institute of Network Computing and Information Systems, School of EECS, Peking University, China; (2) Institute of Computer Science and Technology, Peking University, China; (3) Beijing Institute of Big Data Research, China |
| Pseudocode | Yes | Algorithm 1: The DPP-D algorithm (see the greedy DPP selection sketch after the table) |
| Open Source Code | No | The authors state that "codes and sample data will be soon available at: https://github.com/stellasyp/DPP-Conversational-System", i.e., the code had not yet been released at publication time. |
| Open Datasets | No | We evaluated our approach on a massive Chinese conversation dataset crawled from Baidu Tieba (http://tieba.baidu.com). The paper names the data source but provides neither direct access to the processed dataset used in the experiments nor a specific citation for it. |
| Dataset Splits | Yes | There were 1,600,000 query-reply pairs for training, 2000 pairs for validation, and another unseen 2000 pairs for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., names of libraries or frameworks like PyTorch, TensorFlow, or specific Python versions). |
| Experiment Setup | Yes | To train the neural conversation models, we followed the hyperparameter settings in (Shang, Lu, and Li 2015; Song et al. 2016). The word embeddings were 610d and hidden layers were 1000d. We applied AdaDelta with default hyperparameters, where the batch size was 80. We kept 100K words (Chinese terms) for queries and 30K for replies due to efficiency concerns. The beam size k was 20... (see the configuration sketch after the table) |
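
The paper presents DPP-D only as pseudocode (Algorithm 1), so nothing here reproduces the authors' implementation. The sketch below shows the standard greedy MAP-inference routine that DPP-based reply selection typically reduces to, assuming the common quality-diversity kernel decomposition L = diag(q) S diag(q); the function name, the inputs, and the toy data are illustrative assumptions only.

```python
import numpy as np

def greedy_dpp_select(quality, similarity, k):
    """Greedily pick k diverse, high-quality replies via DPP MAP inference.

    quality:    (n,) nonnegative per-candidate scores (e.g. exp of
                length-normalized beam log-likelihoods) -- an assumption.
    similarity: (n, n) PSD similarity kernel between candidate replies.
    Builds the L-ensemble L = diag(q) @ S @ diag(q) (the usual
    quality-diversity decomposition) and greedily maximizes log det(L_Y).
    """
    q = np.asarray(quality, dtype=float)
    L = q[:, None] * np.asarray(similarity, dtype=float) * q[None, :]
    selected = []
    for _ in range(min(k, len(q))):
        best, best_logdet = None, -np.inf
        for i in range(len(q)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:  # remaining candidates are near-duplicates; stop early
            break
        selected.append(best)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(6, 16))             # stand-ins for reply embeddings
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    S = emb @ emb.T                            # cosine Gram matrix, PSD
    q = rng.uniform(0.5, 1.0, size=6)          # stand-ins for quality scores
    print(greedy_dpp_select(q, S, k=3))        # indices of a diverse subset
```

Greedy selection is the usual tractable stand-in for exact DPP MAP inference, which is NP-hard: each step adds the candidate that most increases log det(L_Y), trading reply quality against redundancy with the replies already chosen.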
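
For concreteness, here is a minimal sketch wiring the quoted numbers (610d embeddings, 1000d hidden layer, AdaDelta with default hyperparameters, batch size 80, 100K query / 30K reply vocabularies, beam size 20) into a generic PyTorch GRU encoder-decoder. Since the paper releases no code, everything beyond those quoted numbers (the framework, the GRU cell, a single layer, no attention) is an assumption.

```python
import torch
import torch.nn as nn

# Quoted hyperparameters; architectural choices around them are assumptions.
EMB_DIM, HID_DIM = 610, 1000            # "word embeddings were 610d", "hidden layers were 1000d"
SRC_VOCAB, TGT_VOCAB = 100_000, 30_000  # 100K query words, 30K reply words
BATCH_SIZE, BEAM_SIZE = 80, 20

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, EMB_DIM)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, EMB_DIM)
        self.encoder = nn.GRU(EMB_DIM, HID_DIM, batch_first=True)
        self.decoder = nn.GRU(EMB_DIM, HID_DIM, batch_first=True)
        self.out = nn.Linear(HID_DIM, TGT_VOCAB)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))        # encode the query
        dec_out, _ = self.decoder(self.tgt_emb(tgt), h)
        return self.out(dec_out)                       # per-step vocabulary logits

model = Seq2Seq()  # illustrative only; the large vocabularies dominate its size
optimizer = torch.optim.Adadelta(model.parameters())  # "AdaDelta with default hyperparameters"
```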