Modeling Topical Relevance for Multi-Turn Dialogue Generation

Authors: Hainan Zhang, Yanyan Lan, Liang Pang, Hongshen Chen, Zhuoye Ding, Dawei Yin

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods, in terms of both metric-based and human evaluations."
Researcher Affiliation | Collaboration | "Hainan Zhang 2, Yanyan Lan 1,4, Liang Pang 1,4, Hongshen Chen 2, Zhuoye Ding 2 and Dawei Yin 3; affiliations: 1 CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, CAS; 2 JD.com, Beijing, China; 3 Baidu.com, Beijing, China; 4 University of Chinese Academy of Sciences"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We run all models on the Tesla K80 GPU with Tensorflow." (footnote 3: https://github.com/zhanghainan/STAR-BTM)
Open Datasets | Yes | "The Chinese customer service dataset, named JDC, consists of 515,686 history-response pairs published by the JD contest. We randomly divided the corpus into training, validation and testing, each contains 500,000, 7843, and 7843 pairs, respectively. The Ubuntu conversation dataset is extracted from the Ubuntu Q&A forum, called Ubuntu [Lowe et al., 2015]."
Dataset Splits | Yes | "We randomly divided the corpus into training, validation and testing, each contains 500,000, 7843, and 7843 pairs, respectively. ... Finally, we obtain 3,980,000, 10,000, and 10,000 history-response pairs for training, validation and testing, respectively." (a split sketch follows the table)
Hardware Specification | Yes | "We run all models on the Tesla K80 GPU with Tensorflow."
Software Dependencies | No | The paper mentions TensorFlow and the Jieba tool but does not specify their version numbers.
Experiment Setup | Yes | "For JDC, the Jieba tool is utilized for Chinese word segmentation, and its vocabulary size is set to 68,521. For Ubuntu, we set the vocabulary size to 15,000. To fairly compare our model with all baselines, the number of hidden nodes is all set to 512 and the batch size set to 32. The max length of sentence is set to 50 and the max number of dialogue turns is set to 15. The number of topics in BTM is set to 8. We use the Adam for gradient optimization in our experiments. The learning rate is set to 0.0001." (a configuration sketch follows the table)
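
The Dataset Splits row reports exact partition sizes: for JDC, 500,000 / 7,843 / 7,843 pairs, which exactly partitions the 515,686 pairs. The following is a minimal Python sketch of such a random split; the file path, line format, and random seed are illustrative assumptions and are not taken from the paper or its repository.

```python
import random

def split_corpus(path, train_n=500_000, valid_n=7_843, test_n=7_843, seed=42):
    """Randomly split a corpus of history-response pairs into train/valid/test.

    Assumes one pair per line; the file format is illustrative, not from the paper.
    """
    with open(path, encoding="utf-8") as f:
        pairs = [line.rstrip("\n") for line in f]
    assert len(pairs) >= train_n + valid_n + test_n
    random.Random(seed).shuffle(pairs)
    train = pairs[:train_n]
    valid = pairs[train_n:train_n + valid_n]
    test = pairs[train_n + valid_n:train_n + valid_n + test_n]
    return train, valid, test

# JDC: 515,686 pairs -> 500,000 / 7,843 / 7,843 (an exact partition).
# Ubuntu: the same procedure yields 3,980,000 / 10,000 / 10,000 pairs.
```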
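
The Experiment Setup row quotes concrete hyperparameters. Below is a minimal sketch, assuming a TensorFlow/Keras-style setup, that collects those values and shows the two tools the paper names (Jieba segmentation and Adam optimization). The CONFIG dictionary, the example sentence, and the optimizer construction are illustrative assumptions; only the numeric values come from the quoted text, and the authors' own code is at https://github.com/zhanghainan/STAR-BTM.

```python
import jieba                      # Chinese word segmentation used for JDC
import tensorflow as tf

# Hyperparameter values quoted in the Experiment Setup row.
CONFIG = {
    "vocab_size_jdc": 68_521,     # Jieba-segmented JDC vocabulary
    "vocab_size_ubuntu": 15_000,
    "hidden_size": 512,           # hidden nodes, same for all models for fair comparison
    "batch_size": 32,
    "max_sentence_len": 50,
    "max_dialogue_turns": 15,
    "num_topics": 8,              # number of topics in the BTM component
    "learning_rate": 1e-4,        # Adam learning rate
}

# Illustrative preprocessing step: segment a Chinese utterance with Jieba.
tokens = list(jieba.cut("请问这个订单什么时候发货"))  # example sentence, not from the dataset

# Illustrative optimizer matching the quoted settings (a sketch of the configuration,
# not the authors' exact training code).
optimizer = tf.keras.optimizers.Adam(learning_rate=CONFIG["learning_rate"])
```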