Dynamic Knowledge Routing Network for Target-Guided Open-Domain Conversation

Authors: Jinghui Qin, Zheng Ye, Jianheng Tang, Xiaodan Liang
Pages: 8657-8664

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on two target-guided open-domain conversation datasets show the superiority of our approach, which significantly surpasses state-of-the-art methods in keyword prediction accuracy, retrieval accuracy, and success rate of conversation under automatic metrics as well as human evaluation."
Researcher Affiliation | Collaboration | "Jinghui Qin,¹ Zheng Ye,¹ Jianheng Tang,¹ Xiaodan Liang¹,² — ¹Sun Yat-Sen University, ²Dark Matter AI Inc. {qinjingh, yezh7}@mail2.sysu.edu.cn, {sqrt3tjh, xdliang328}@gmail.com"
Pseudocode | No | The paper describes the model architecture and processes in narrative text and through a diagram (Figure 1), but it does not include a formal pseudocode or algorithm block.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "To push the research boundary of the task to match real-world scenarios better, we construct a new Weibo Conversation Dataset for target-guided open-domain conversation, denoted as CWC. Our dataset is derived from a public multi-turn conversation corpus¹ crawled from Sina Weibo, which is one of the most popular social platforms in China." [footnote 1: http://tcci.ccf.org.cn/conference/2018/dldoc/trainingdata05.zip]
Dataset Splits | Yes | "We split our dataset randomly into three parts: train set (90%), validation set (5%), and test set (5%)."
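The reported 90/5/5 random split can be sketched as follows. The shuffle seed and the `split_dataset` helper are assumptions for illustration; the paper states only the proportions and that the split is random.

```python
import random

def split_dataset(samples, seed=42):
    """Randomly split samples into train/validation/test (90%/5%/5%),
    matching the split reported for the CWC dataset.
    The fixed seed is an assumption; the paper does not state one."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * 0.90)
    n_val = int(n * 0.05)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(list(range(1000)))
```

For 1,000 conversations this yields 900/50/50 examples in train/validation/test.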
Hardware Specification | No | The paper mentions using GRU, GloVe, Word2Vec, and ADAM optimization, but does not specify any hardware details such as CPU or GPU models, memory, or the computing environment used for the experiments.
Software Dependencies | No | The paper mentions using 'GloVe', 'Baidu Encyclopedia Word2Vec', 'ADAM optimization (Kingma and Ba 2015)', 'Texar (Hu et al. 2019)', and the 'Dial Crowd toolkit (Lee et al. 2018)', but it does not specify concrete version numbers for these software components.
Experiment Setup | Yes | "For TGPC and CWC, we both apply a single-layer GRU (Chung et al. 2014) in our encoder. For TGPC, both the word embedding and hidden dimensions are set to 200. GloVe is used to initialize word embeddings. For CWC, we set the word embedding and hidden dimensions as 300. Baidu Encyclopedia Word2Vec (Li et al. 2018b) is used to initialize word embeddings. The other hyper-parameters for both datasets are the same. We apply ADAM optimization (Kingma and Ba 2015) with an initial learning rate of 0.001 and decay to 0.0001 in 10 epochs."
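The quoted schedule only fixes the endpoints (0.001 decaying to 0.0001 within 10 epochs). One way to realize it is a multiplicative per-epoch decay; the exponential shape, and the `lr_schedule` helper itself, are assumptions, since the paper does not specify how the decay is applied.

```python
def lr_schedule(epoch, lr_init=1e-3, lr_final=1e-4, decay_epochs=10):
    """Learning rate for the ADAM optimizer at a given epoch, decaying
    exponentially from lr_init to lr_final over decay_epochs and held
    constant afterwards. The exponential form is an assumption; the
    paper states only the initial and final values."""
    if epoch >= decay_epochs:
        return lr_final
    # Constant per-epoch factor so that lr(decay_epochs) == lr_final.
    return lr_init * (lr_final / lr_init) ** (epoch / decay_epochs)
```

Under this assumption the per-epoch decay factor is (0.0001/0.001)^(1/10) ≈ 0.794, so the rate falls smoothly from 0.001 at epoch 0 to 0.0001 at epoch 10.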