Mechanism-Aware Neural Machine for Dialogue Response Generation

Authors: Ganbin Zhou, Ping Luo, Rongyu Cao, Fen Lin, Bo Chen, Qing He

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evidence quotes from the paper: "Finally, the experiments with human judgements, intuitive examples, detailed discussions demonstrate the quality and diversity of the generated responses with 9.80% increase of acceptable ratio over the best of six baseline methods." (Quantitative Study of Response Diversity) "Here, we quantitatively study the issue of response diversity in the training corpus for conversation compared with the one for machine translation." (Experiment Process, Dataset Details) "To obtain the conversation corpus, we collected nearly 14 million post-response pairs from Tencent Weibo." (Human Judgement) "Due to response diversity, it is practically impossible to establish a data set which adequately cover all the responses for given posts." (Experimental Results and Analysis) "The experimental results are summarized in Table 1."
Researcher Affiliation | Collaboration | 1 Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China. 2 University of Chinese Academy of Sciences, Beijing 100049, China. 3 Pattern Recognition Center, WeChat Technical Architecture Department, Tencent, China.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'All the models were implemented using Theano (Theano Development Team 2016)' but provides neither a link nor an explicit statement that the authors' own implementation is open source.
Open Datasets | Yes | "To obtain the conversation corpus, we collected nearly 14 million post-response pairs from Tencent Weibo." "We used a public corpus for machine translation (CWMT 2013) as Dt."
Dataset Splits | Yes | "Totally, we have 815,852 pairs left, among which 775,852 ones are for training, and 40,000 for model validation."
Hardware Specification | No | The paper does not provide any specific hardware details, such as GPU/CPU models or memory, for the machines used to run the experiments.
Software Dependencies | No | The paper mentions 'All the models were implemented using Theano (Theano Development Team 2016)', but it does not specify a Theano version number.
Experiment Setup | Yes | "The dimension of the word embedding is set to 128 for all the models. We applied the one-layer GRU units (each with 1024 cells) to all the models in experiment. For MARM, the number of mechanisms is M = 4. The mechanism embeddings with 128 dimensions are initialized by a uniform distribution between -0.2 and 0.2. For response generation, we select top L = 2 mechanisms for beam search, the number of response candidates from each mechanism is K = 5. The beam size is 200 for all models. All the other parameters are initialized by a uniform distribution between -0.01 and 0.01. In training, we divided the corpus into mini-batches whose size is 128, and used the RMSProp (Graves 2013) algorithm for optimization."
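The split arithmetic and the decoding bookkeeping reported in the rows above can be checked with a minimal sketch. The constant names and the `select_top_mechanisms` helper below are illustrative assumptions, not the authors' Theano implementation; only the numeric values come from the paper.

```python
# Data split reported in the "Dataset Splits" row: 775,852 train + 40,000 valid.
TOTAL_PAIRS, TRAIN_PAIRS, VALID_PAIRS = 815_852, 775_852, 40_000
assert TRAIN_PAIRS + VALID_PAIRS == TOTAL_PAIRS

# Hyperparameters reported in the "Experiment Setup" row.
EMBED_DIM = 128   # word and mechanism embedding dimension
GRU_CELLS = 1024  # one-layer GRU
M = 4             # number of latent mechanisms in MARM
TOP_L = 2         # mechanisms selected for beam search
K = 5             # response candidates per selected mechanism
BEAM_SIZE = 200
BATCH_SIZE = 128

def select_top_mechanisms(posteriors, top_l=TOP_L):
    """Return indices of the top-L mechanisms, ranked by posterior probability."""
    ranked = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    return ranked[:top_l]

# Illustrative (made-up) posterior over M = 4 mechanisms for one input post:
chosen = select_top_mechanisms([0.05, 0.40, 0.35, 0.20])
print(chosen, len(chosen) * K)  # top-2 mechanisms yield 2 * 5 = 10 candidates
```

With the reported L = 2 and K = 5, each post yields at most 10 response candidates before final ranking, regardless of the beam size used inside each mechanism's search.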