Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Authors: Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion. |
| Researcher Affiliation | Academia | State Key Laboratory of Intelligent Technology and Systems, National Laboratory for Information Science and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, PR China; Dept. of Computer Science, University of Illinois at Chicago, Chicago, Illinois, USA |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | We used Tensorflow to implement the proposed model (code: https://github.com/tuxchow/ecm). |
| Open Datasets | Yes | Since there is no off-the-shelf data to train ECM, we first trained an emotion classifier using the NLPCC emotion classification dataset and then used the classifier to annotate the STC conversation dataset (Shang, Lu, and Li 2015) to construct our own experiment dataset. |
| Dataset Splits | Yes | We then partitioned the NLPCC dataset into training, validation, and test sets with the ratio of 8:1:1 (see the split sketch below the table). |
| Hardware Specification | Yes | We ran 20 epochs, and the training stage of each model took about a week on a Titan X GPU machine. |
| Software Dependencies | No | The paper mentions using Tensorflow for implementation but does not specify a version number for Tensorflow or any other software dependency. |
| Experiment Setup | Yes | The encoder and decoder have 2-layer GRU structures with 256 hidden cells for each layer and use different sets of parameters. The word embedding size is set to 100. The vocabulary size is limited to 40,000. The embedding size of the emotion category is set to 100. The internal memory is a trainable matrix of size 6 × 256, and the external memory is a list of 40,000 words containing generic words and emotion words (emotion words carry different markers). To generate diverse responses, beam search is used in decoding with a beam size of 20, and responses are then reranked by generation probability after removing those containing UNKs (unknown words). Training uses stochastic gradient descent (SGD) with mini-batches; batch size and learning rate are set to 128 and 0.5, respectively. (A configuration sketch follows the table.) |
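The 8:1:1 partition reported in the Dataset Splits row is simple to reproduce. The sketch below is a minimal illustration, not taken from the paper or its released code; the helper name `split_8_1_1` and the fixed seed are our own assumptions, and it presumes the corpus fits in memory as a list of examples.

```python
import random

def split_8_1_1(examples, seed=42):
    """Partition a corpus into train/validation/test sets with an
    8:1:1 ratio, as reported for the NLPCC dataset.
    Hypothetical helper, not from the paper's released code."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)  # fixed seed for a reproducible split
    n = len(examples)
    n_train = int(n * 0.8)
    n_valid = int(n * 0.1)
    train = examples[:n_train]
    valid = examples[n_train:n_train + n_valid]
    test = examples[n_train + n_valid:]
    return train, valid, test
```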
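The hyperparameters in the Experiment Setup row can be gathered into a short configuration sketch. The snippet below assumes TensorFlow 2.x / Keras, which the paper (implemented in an unspecified, older Tensorflow version) does not use; it only shows how the reported numbers fit together around a stacked 2-layer GRU encoder and the 6 × 256 internal memory, and is not the authors' implementation (see https://github.com/tuxchow/ecm for that).

```python
import tensorflow as tf

# Hyperparameters as reported in the Experiment Setup row.
VOCAB_SIZE = 40_000    # vocabulary limited to 40,000 words
WORD_EMB = 100         # word embedding size
EMOTION_EMB = 100      # emotion-category embedding size
HIDDEN = 256           # GRU hidden cells per layer
NUM_EMOTIONS = 6       # internal memory is a 6 x 256 trainable matrix
BATCH_SIZE = 128
LEARNING_RATE = 0.5
BEAM_SIZE = 20         # beam width used at decoding time

# Encoder sketch: word embedding followed by a stacked 2-layer GRU.
# The paper's decoder mirrors this structure with its own parameters.
encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, WORD_EMB),
    tf.keras.layers.GRU(HIDDEN, return_sequences=True),
    tf.keras.layers.GRU(HIDDEN, return_sequences=True),
])

# Internal emotion memory: one trainable 256-d vector per emotion category.
internal_memory = tf.Variable(
    tf.random.uniform([NUM_EMOTIONS, HIDDEN]), name="internal_memory")

# Plain SGD with mini-batches, matching the reported optimizer settings.
optimizer = tf.keras.optimizers.SGD(learning_rate=LEARNING_RATE)
```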