Augmenting End-to-End Dialogue Systems With Commonsense Knowledge

Authors: Tom Young, Erik Cambria, Iti Chaturvedi, Hao Zhou, Subham Biswas, Minlie Huang

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments suggest that the knowledge-augmented models are superior to their knowledge-free counterparts. The main results for TF-IDF, word embeddings, memory networks and LSTM models are summarized in Table 1.
Researcher Affiliation | Academia | (1) School of Information and Electronics, Beijing Institute of Technology, China; (2) School of Computer Science and Engineering, Nanyang Technological University, Singapore; (3) Department of Computer Science and Technology, Tsinghua University, China
Pseudocode | No | The paper provides model descriptions and mathematical formulations but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | ConceptNet can be downloaded at http://github.com/commonsense/conceptnet5/wiki/Downloads. We gratefully acknowledge the help of Alan Ritter for sharing the Twitter dialogue dataset.
Dataset Splits | Yes | 1M Twitter <message, response> pairs are used for training. ... For tuning and evaluation, we use 20K <message, response> pairs that constitute the validation set (10K) and test set (10K). (See the split sketch below the table.)
Hardware Specification | No | The paper mentions the 'NTU PDCC center for providing computing resources' but does not specify any hardware details such as GPU/CPU models or processor types.
Software Dependencies | No | The paper mentions software components like GloVe and LSTM models and optimization methods like SGD, but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The size of hidden units in LSTM models is set to 256 and the word embedding dimension is 100. We use stochastic gradient descent (SGD) for optimizing with batch size of 64. We fixed training rate at 0.001. (See the configuration sketch below the table.)
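
The Dataset Splits row reports 1M training pairs plus a 20K held-out pool divided evenly into validation and test sets. Below is a minimal sketch of that carve-out, assuming the <message, response> pairs are already loaded as a list of tuples; the function name and random seed are illustrative, not taken from the paper.

```python
import random

def split_pairs(pairs, val_size=10_000, test_size=10_000, seed=0):
    """Shuffle <message, response> pairs, then carve out validation and test
    sets of the sizes reported in the paper (10K each); the remainder is the
    training set (roughly 1M pairs in the paper's setting)."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    val = pairs[:val_size]
    test = pairs[val_size:val_size + test_size]
    train = pairs[val_size + test_size:]
    return train, val, test
```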
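
The Experiment Setup row quotes the only training hyperparameters reported: 256 LSTM hidden units, 100-dimensional word embeddings, SGD with batch size 64, and a fixed rate of 0.001. The sketch below plugs those values into a minimal PyTorch encoder; the vocabulary size and the single-layer layout are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

EMBED_DIM = 100        # word embedding dimension reported in the paper
HIDDEN_DIM = 256       # LSTM hidden size reported in the paper
BATCH_SIZE = 64        # SGD batch size reported in the paper
LEARNING_RATE = 0.001  # fixed rate reported in the paper
VOCAB_SIZE = 20_000    # assumption; not stated in the quoted excerpt

class LSTMEncoder(nn.Module):
    """Embeds a token sequence and encodes it with a single LSTM layer."""
    def __init__(self, vocab_size=VOCAB_SIZE):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq_len, EMBED_DIM)
        outputs, (hidden, _) = self.lstm(embedded)  # hidden: (1, batch, HIDDEN_DIM)
        return outputs, hidden

encoder = LSTMEncoder()
optimizer = torch.optim.SGD(encoder.parameters(), lr=LEARNING_RATE)
```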