Generating Multiple Diverse Responses for Short-Text Conversation

Authors: Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Shuming Shi

AAAI 2019, pp. 6383-6390 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two short-text conversation tasks validate that the multiple responses generated by our model obtain higher quality and larger diversity compared with various state-of-the-art generative models.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Soochow University, Suzhou, China (imgaojun@gmail.com, lijunhui@suda.edu.cn); (2) Tencent AI Lab, Shenzhen, China ({victoriabi, kieranliu, shumingshi}@tencent.com)
Pseudocode | Yes | Algorithm 1: RL Training Algorithm (a generic illustrative sketch follows the table).
Open Source Code | Yes | All our code and datasets are available at https://ai.tencent.com/ailab/nlp/dialogue.html.
Open Datasets | Yes | Weibo: We used the benchmark dataset (Shang, Lu, and Li 2015) and pre-processed it for high-quality data pairs. In total, we have over 4 million training pairs. Twitter: We crawled post-response pairs using the Twitter API. After several data cleaning steps, we have around 750 thousand training pairs. Both datasets use a vocabulary size of 50,000. Details of the data preprocessing and statistics are in the Appendix. All our code and datasets are available at https://ai.tencent.com/ailab/nlp/dialogue.html. (A vocabulary-truncation sketch follows the table.)
Dataset Splits | No | The paper does not explicitly provide dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) for a validation set; it only mentions training and test data.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup | No | The training details, including the keyword extraction, clustering, and network configurations, are provided in the Appendix.
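
The Pseudocode row only names the paper's Algorithm 1 ("RL Training Algorithm") without reproducing it. For orientation, below is a minimal, generic REINFORCE-style policy-gradient step for a toy response generator in PyTorch. It is an illustrative sketch under assumed names (ToyGenerator, reinforce_step, reward_fn), not the paper's Algorithm 1, its reward design, or its keyword-conditioned decoding.

```python
# Hypothetical sketch only: a generic REINFORCE update for a toy response
# generator. It is NOT the paper's Algorithm 1 or its reward function.
import torch
import torch.nn as nn


class ToyGenerator(nn.Module):
    """Tiny stand-in for a seq2seq response generator (hypothetical)."""

    def __init__(self, vocab_size=50_000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)  # per-step vocabulary logits


def reinforce_step(model, optimizer, post_tokens, reward_fn, max_len=20):
    """One REINFORCE update: sample a response, score it, scale log-probs by the reward."""
    model.train()
    tokens = post_tokens  # toy setup: decode directly after the post tokens
    log_probs, sampled = [], []
    for _ in range(max_len):
        logits = model(tokens)[:, -1, :]                 # next-token distribution
        dist = torch.distributions.Categorical(logits=logits)
        tok = dist.sample()
        log_probs.append(dist.log_prob(tok))
        sampled.append(tok)
        tokens = torch.cat([tokens, tok.unsqueeze(1)], dim=1)
    reward = reward_fn(torch.stack(sampled, dim=1))      # e.g. a quality/diversity score
    loss = -(torch.stack(log_probs, dim=1).sum(dim=1) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    gen = ToyGenerator()
    opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
    posts = torch.randint(0, 50_000, (4, 10))                  # 4 fake tokenized posts
    reward = lambda responses: torch.rand(responses.size(0))   # placeholder reward
    print(reinforce_step(gen, opt, posts, reward))
```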
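
The Open Datasets row states that both datasets use a 50,000-word vocabulary. The following is a minimal sketch, assuming whitespace-tokenized post-response pairs and a hypothetical helper name (build_vocab), of how such a fixed-size vocabulary is typically built by keeping the most frequent tokens; the paper's actual preprocessing is only described in its Appendix.

```python
# Hypothetical sketch: truncate a corpus vocabulary to the 50,000 most
# frequent tokens, matching the reported vocabulary size of both datasets.
from collections import Counter


def build_vocab(pairs, vocab_size=50_000, specials=("<pad>", "<unk>", "<bos>", "<eos>")):
    """pairs: iterable of (post, response) strings, assumed whitespace-tokenized."""
    counts = Counter()
    for post, response in pairs:
        counts.update(post.split())
        counts.update(response.split())
    most_common = [tok for tok, _ in counts.most_common(vocab_size - len(specials))]
    itos = list(specials) + most_common          # index -> token
    stoi = {tok: i for i, tok in enumerate(itos)}  # token -> index
    return stoi, itos


# Example usage with two toy pairs:
stoi, itos = build_vocab([("how are you", "i am fine"), ("nice weather", "yes it is")])
print(len(itos), stoi.get("fine"))
```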