Modeling Topical Relevance for Multi-Turn Dialogue Generation

Authors: Hainan Zhang, Yanyan Lan, Liang Pang, Hongshen Chen, Zhuoye Ding, Dawei Yin

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods, in terms of both metric-based and human evaluations."
Researcher Affiliation | Collaboration | "Hainan Zhang 2, Yanyan Lan 1,4, Liang Pang 1,4, Hongshen Chen 2, Zhuoye Ding 2 and Dawei Yin 3; affiliations: 1 CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, CAS; 2 JD.com, Beijing, China; 3 Baidu.com, Beijing, China; 4 University of Chinese Academy of Sciences"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We run all models on the Tesla K80 GPU with Tensorflow." (footnote 3: https://github.com/zhanghainan/STAR-BTM)
Open Datasets | Yes | "The Chinese customer service dataset, named JDC, consists of 515,686 history-response pairs published by the JD contest. We randomly divided the corpus into training, validation and testing, each contains 500,000, 7843, and 7843 pairs, respectively. The Ubuntu conversation dataset is extracted from the Ubuntu Q&A forum, called Ubuntu [Lowe et al., 2015]."
Dataset Splits | Yes | "We randomly divided the corpus into training, validation and testing, each contains 500,000, 7843, and 7843 pairs, respectively. ... Finally, we obtain 3,980,000, 10,000, and 10,000 history-response pairs for training, validation and testing, respectively." (a split sketch follows the table)
Hardware Specification | Yes | "We run all models on the Tesla K80 GPU with Tensorflow."
Software Dependencies | No | The paper mentions TensorFlow and the Jieba tool but does not specify their version numbers.
Experiment Setup | Yes | "For JDC, the Jieba tool is utilized for Chinese word segmentation, and its vocabulary size is set to 68,521. For Ubuntu, we set the vocabulary size to 15,000. To fairly compare our model with all baselines, the number of hidden nodes is all set to 512 and the batch size set to 32. The max length of sentence is set to 50 and the max number of dialogue turns is set to 15. The number of topics in BTM is set to 8. We use the Adam for gradient optimization in our experiments. The learning rate is set to 0.0001." (a configuration sketch follows the table)
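
The Dataset Splits row reports exact partition sizes: for JDC, 500,000 / 7,843 / 7,843 pairs, which exactly partitions the 515,686 pairs. The following is a minimal Python sketch of such a random split; the file path, line format, and random seed are illustrative assumptions and are not taken from the paper or its repository.

```python
import random

def split_corpus(path, train_n=500_000, valid_n=7_843, test_n=7_843, seed=42):
    """Randomly split a corpus of history-response pairs into train/valid/test.

    Assumes one pair per line; the file format is illustrative, not from the paper.
    """
    with open(path, encoding="utf-8") as f:
        pairs = [line.rstrip("\n") for line in f]
    assert len(pairs) >= train_n + valid_n + test_n
    random.Random(seed).shuffle(pairs)
    train = pairs[:train_n]
    valid = pairs[train_n:train_n + valid_n]
    test = pairs[train_n + valid_n:train_n + valid_n + test_n]
    return train, valid, test

# JDC: 515,686 pairs -> 500,000 / 7,843 / 7,843 (an exact partition).
# Ubuntu: the same procedure yields 3,980,000 / 10,000 / 10,000 pairs.
```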
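
The Experiment Setup row quotes concrete hyperparameters. Below is a minimal sketch, assuming a TensorFlow/Keras-style setup, that collects those values and shows the two tools the paper names (Jieba segmentation and Adam optimization). The CONFIG dictionary, the example sentence, and the optimizer construction are illustrative assumptions; only the numeric values come from the quoted text, and the authors' own code is at https://github.com/zhanghainan/STAR-BTM.

```python
import jieba                      # Chinese word segmentation used for JDC
import tensorflow as tf

# Hyperparameter values quoted in the Experiment Setup row.
CONFIG = {
    "vocab_size_jdc": 68_521,     # Jieba-segmented JDC vocabulary
    "vocab_size_ubuntu": 15_000,
    "hidden_size": 512,           # hidden nodes, same for all models for fair comparison
    "batch_size": 32,
    "max_sentence_len": 50,
    "max_dialogue_turns": 15,
    "num_topics": 8,              # number of topics in the BTM component
    "learning_rate": 1e-4,        # Adam learning rate
}

# Illustrative preprocessing step: segment a Chinese utterance with Jieba.
tokens = list(jieba.cut("请问这个订单什么时候发货"))  # example sentence, not from the dataset

# Illustrative optimizer matching the quoted settings (a sketch of the configuration,
# not the authors' exact training code).
optimizer = tf.keras.optimizers.Adam(learning_rate=CONFIG["learning_rate"])
```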