Topic Aware Neural Response Generation
Authors: Chen Xing, Wei Wu, Yu Wu, Jie Liu, Yalou Huang, Ming Zhou, Wei-Ying Ma
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on both automatic evaluation metrics and human annotations show that TA-Seq2Seq can generate more informative and interesting responses, significantly outperforming state-of-the-art response generation models. |
| Researcher Affiliation | Collaboration | Chen Xing (1,2), Wei Wu (4), Yu Wu (3), Jie Liu (1,2), Yalou Huang (1,2), Ming Zhou (4), Wei-Ying Ma (4). 1: College of Computer and Control Engineering, Nankai University, Tianjin, China; 2: College of Software, Nankai University, Tianjin, China; 3: State Key Lab of Software Development Environment, Beihang University, Beijing, China; 4: Microsoft Research, Beijing, China. Emails: {v-chxing, wuwei, v-wuyu, mingzhou, wyma}@microsoft.com, {jliu, huangyl}@nankai.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We implemented the models with an open source deep learning tool Blocks, and shared the code of our model at https://github.com/LynetteXing1991. |
| Open Datasets | No | We build a data set from Baidu Tieba, the largest Chinese forum, which allows users to post and comment on others' posts. We crawl 20 million post-comment pairs and use them to simulate message-response pairs in conversation... In our experiments, we train a Twitter LDA model using large-scale posts from Sina Weibo, the largest microblogging service in China. The paper describes the data collection but does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for the specific datasets used in the experiments. (A hedged topic-model sketch follows the table.) |
| Dataset Splits | Yes | After this preprocessing, there are 15,209,588 pairs left. From them, we randomly sample 5 million distinct message-response pairs as training data, 10,000 distinct pairs as validation data, and 1,000 distinct messages with their responses as test data. (See the split sketch after the table.) |
| Hardware Specification | Yes | All models were initialized with isotropic Gaussian distributions X ~ N(0, 0.01) and trained with an AdaDelta algorithm (Zeiler 2012) on an NVIDIA Tesla K40 GPU. |
| Software Dependencies | No | We implemented the models with an open source deep learning tool Blocks. The Stanford Chinese word segmenter is also mentioned. However, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | We set the number of topics T as 200 and the hyperparameters of Twitter LDA as α = 1/T, β = 0.01, γ = 0.01... We set the dimensions of the hidden states of the encoder and the decoder as 1000, and the dimensions of word embeddings as 620. All models were initialized with isotropic Gaussian distributions X ~ N(0, 0.01)... The batch size is 128. We set the initial learning rate as 1.0 and reduced it by half if the perplexity on validation began to increase. (A configuration sketch follows the table.) |
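To make the Open Datasets row concrete, here is a minimal sketch of training a topic model with the hyperparameters reported in the Experiment Setup row (T = 200, α = 1/T, β = 0.01). The paper trains a Twitter LDA model on Sina Weibo posts; gensim's standard `LdaModel` is used here as a plainly labeled stand-in, since Twitter LDA is not implemented in gensim, and `tokenized_posts` is a hypothetical input.

```python
from gensim import corpora, models

def train_topic_model(tokenized_posts):
    """Sketch: standard LDA with the paper's reported hyperparameters.

    The paper uses Twitter LDA (not available in gensim); this stand-in
    only illustrates T=200, alpha=1/T, beta=0.01. The gamma parameter of
    Twitter LDA has no counterpart in standard LDA and is omitted.
    """
    dictionary = corpora.Dictionary(tokenized_posts)
    corpus = [dictionary.doc2bow(post) for post in tokenized_posts]
    return models.LdaModel(
        corpus=corpus,
        id2word=dictionary,
        num_topics=200,     # T
        alpha=1.0 / 200,    # alpha = 1/T
        eta=0.01,           # "beta" in the paper's notation
    )
```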
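The split sizes in the Dataset Splits row (5 million training pairs, 10,000 validation pairs, 1,000 test messages out of 15,209,588) can be reproduced with simple random sampling. A minimal sketch, assuming `pairs` is a list of already deduplicated `(message, response)` tuples; the shuffling procedure and seed are assumptions, only the split sizes come from the paper.

```python
import random

def split_pairs(pairs, seed=42):
    """Sample train/validation/test splits with the paper's reported sizes."""
    rng = random.Random(seed)
    rng.shuffle(pairs)
    train = pairs[:5_000_000]
    valid = pairs[5_000_000:5_010_000]
    # The test set is built from 1,000 distinct *messages* (with their
    # responses), not from 1,000 pairs.
    seen, test = set(), []
    for msg, resp in pairs[5_010_000:]:
        if msg not in seen:
            seen.add(msg)
            test.append((msg, resp))
        if len(test) == 1_000:
            break
    return train, valid, test
```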
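The Hardware Specification and Experiment Setup rows together pin down most of the training configuration. A minimal sketch in Python, assuming N(0, 0.01) denotes variance 0.01 (std = 0.1) per the usual notation, and one plausible reading of the learning-rate schedule; the seed and helper names are illustrative, only the numeric values come from the paper.

```python
import numpy as np

# Values reported in the Experiment Setup / Hardware Specification rows.
CONFIG = {
    "num_topics": 200,             # T for Twitter LDA
    "lda_alpha": 1.0 / 200,        # alpha = 1/T
    "lda_beta": 0.01,
    "lda_gamma": 0.01,
    "hidden_size": 1000,           # encoder/decoder hidden state dimension
    "embedding_size": 620,         # word embedding dimension
    "batch_size": 128,
    "initial_learning_rate": 1.0,  # trained with AdaDelta (Zeiler 2012)
}

_rng = np.random.default_rng(0)  # seed is an assumption

def init_weights(shape):
    """Isotropic Gaussian initialization X ~ N(0, 0.01).

    N(0, 0.01) is read here as variance 0.01 (std = 0.1); the paper does
    not disambiguate, and std = 0.01 is the other plausible reading.
    """
    return _rng.normal(loc=0.0, scale=np.sqrt(0.01), size=shape)

def next_learning_rate(lr, val_perplexities):
    """Halve the rate once validation perplexity begins to increase.

    One plausible reading of "reduced it by half if the perplexity on
    validation began to increase"; the exact trigger is not specified.
    """
    if len(val_perplexities) >= 2 and val_perplexities[-1] > val_perplexities[-2]:
        return lr / 2.0
    return lr
```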