Augmenting End-to-End Dialogue Systems With Commonsense Knowledge
Authors: Tom Young, Erik Cambria, Iti Chaturvedi, Hao Zhou, Subham Biswas, Minlie Huang
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments suggest that the knowledge-augmented models are superior to their knowledge-free counterparts. The main results for TF-IDF, word embeddings, memory networks and LSTM models are summarized in Table 1. |
| Researcher Affiliation | Academia | (1) School of Information and Electronics, Beijing Institute of Technology, China; (2) School of Computer Science and Engineering, Nanyang Technological University, Singapore; (3) Department of Computer Science and Technology, Tsinghua University, China |
| Pseudocode | No | The paper provides model descriptions and mathematical formulations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | ConceptNet can be downloaded at http://github.com/commonsense/conceptnet5/wiki/Downloads. We gratefully acknowledge the help of Alan Ritter for sharing the Twitter dialogue dataset. |
| Dataset Splits | Yes | 1M Twitter <message, response> pairs are used for training. ... For tuning and evaluation, we use 20K <message, response> pairs that constitute the validation set (10K) and test set (10K). |
| Hardware Specification | No | The paper mentions 'NTU PDCC center for providing computing resources' but does not specify any hardware details such as GPU/CPU models or processor types. |
| Software Dependencies | No | The paper mentions software components like GloVe and LSTM models and optimization methods like SGD, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The size of hidden units in LSTM models is set to 256 and the word embedding dimension is 100. We use stochastic gradient descent (SGD) for optimizing with a batch size of 64. The learning rate is fixed at 0.001. A hedged configuration sketch based on these values follows the table. |
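
The experiment-setup row quotes concrete hyperparameters (256 LSTM hidden units, 100-dim embeddings, SGD with batch size 64 and learning rate 0.001). Below is a minimal sketch of how such a configuration might look; PyTorch, the vocabulary size, and the model wiring are our assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary size; the paper does not report one.
VOCAB_SIZE = 50_000


class LSTMEncoder(nn.Module):
    """Sketch of an LSTM encoder using the quoted hyperparameters:
    100-dim word embeddings and 256 hidden units."""

    def __init__(self, vocab_size=VOCAB_SIZE, embed_dim=100, hidden_size=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, 100)
        _, (hidden, _) = self.lstm(embedded)   # hidden: (1, batch, 256)
        return hidden.squeeze(0)               # (batch, 256)


encoder = LSTMEncoder()
# SGD with the quoted fixed learning rate of 0.001; batches of 64 pairs.
optimizer = torch.optim.SGD(encoder.parameters(), lr=0.001)
BATCH_SIZE = 64
```

This sketch only mirrors the reported settings; the actual architecture in the paper also incorporates commonsense knowledge via a memory component, which is not shown here.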