Keyword-Guided Neural Conversational Model
Authors: Peixiang Zhong, Yong Liu, Hao Wang, Chunyan Miao
AAAI 2021, pp. 14568–14576
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Automatic evaluations suggest that commonsense improves the performance of both next-turn keyword prediction and keyword-augmented response retrieval. In addition, both self-play and human evaluations show that our model produces responses with smoother keyword transition and reaches the target keyword faster than competitive baselines. |
| Researcher Affiliation | Collaboration | (1) Alibaba-NTU Singapore Joint Research Institute, Nanyang Technological University (NTU), Singapore; (2) Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, NTU, Singapore; (3) Alibaba Group, China |
| Pseudocode | No | The paper describes the model architecture and approach using textual descriptions and diagrams (Figure 3, Figure 4), but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | No explicit statement or link is provided indicating that the source code for the methodology described in this paper is publicly available. |
| Open Datasets | Yes | We use the ConvAI2 dataset proposed in (Zhang et al. 2018a; Dinan et al. 2019a) and preprocessed in (Tang et al. 2019) in our experiments. In addition, we collect a large-scale open-domain conversation dataset from the social media site Reddit (https://www.reddit.com/; we use the Pushshift dataset on Google BigQuery). The proposed Reddit dataset is collected from casual chats on the CasualConversation and CasualUK subreddits, where users chat freely with each other on any topic. Reddit is significantly larger and more diverse than ConvAI2. (A hedged sketch of this collection step appears after the table.) |
| Dataset Splits | Yes | Per-split statistics (#Conv. / #Utter. / #Key. / Avg. #Key.): ConvAI2 Train: 8950 / 132601 / 2678 / 1.78; ConvAI2 Valid: 485 / 7244 / 2069 / 1.79; ConvAI2 Test: 500 / 7194 / 1571 / 1.50; Reddit Train: 112693 / 461810 / 2931 / 2.27; Reddit Valid: 6192 / 25899 / 2851 / 2.25; Reddit Test: 5999 / 24108 / 2846 / 2.30 |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments are provided in the paper. |
| Software Dependencies | No | No specific software versions for dependencies are provided. The paper mentions using 'GloVe' embeddings and optimizing with 'Adam', but does not give version numbers for these or any other software/libraries. |
| Experiment Setup | Yes | All hidden sizes in GRU and GGNN are set to 200. We use one layer in GGNN and set λ_k = 0.01. We optimize our model using Adam (Kingma and Ba 2014) with a batch size of 32, an initial learning rate of 0.001, and a decay rate of 0.9 for every epoch. (A minimal training-setup sketch appears after the table.) |
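
As a reproducibility aid, below is a minimal sketch of how the Reddit data described in the Open Datasets row could be pulled from the Pushshift dump on Google BigQuery. The paper does not release collection code, so the table name `fh-bigquery.reddit_comments.2018_01`, the column names, the date range, and the filtering are all assumptions for illustration only.

```python
# Hypothetical Pushshift-on-BigQuery pull; requires `pip install google-cloud-bigquery`
# and configured Google Cloud credentials.
from google.cloud import bigquery

client = bigquery.Client()

# ASSUMPTION: the public `fh-bigquery.reddit_comments` mirror of Pushshift and
# this schema (subreddit, body, link_id, parent_id, created_utc); the paper
# does not state which tables or months it queried.
query = """
    SELECT link_id, parent_id, body, created_utc
    FROM `fh-bigquery.reddit_comments.2018_01`
    WHERE subreddit IN ('CasualConversation', 'CasualUK')
      AND body NOT IN ('[deleted]', '[removed]')
"""

# Stream the matching comments; threading them into conversations via
# link_id/parent_id would be a separate (unspecified) preprocessing step.
for row in client.query(query).result():
    print(row.link_id, row.body[:80])
```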
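Likewise, here is a minimal sketch of the Experiment Setup row as a training configuration, assuming a PyTorch implementation. Only the hidden size, batch size, optimizer, initial learning rate, per-epoch decay rate, and λ_k come from the paper; the 300-dimensional input (a common GloVe size, not stated in the paper), the placeholder GRU body, and the dummy loop are assumptions, since the authors' model code is not available.

```python
import torch
from torch import nn

# Hyperparameters as reported in the paper's experiment setup.
HIDDEN_SIZE = 200   # all GRU and GGNN hidden sizes
BATCH_SIZE = 32
INIT_LR = 1e-3      # initial learning rate for Adam
LR_DECAY = 0.9      # learning-rate decay applied after every epoch
LAMBDA_K = 0.01     # lambda_k, weight of the keyword prediction loss

# ASSUMPTION: a single GRU stands in for the paper's full model (GRU encoders
# plus a one-layer GGNN over the commonsense graph), purely to show the
# optimizer and scheduler wiring; 300 matches a common GloVe dimension.
model = nn.GRU(input_size=300, hidden_size=HIDDEN_SIZE, batch_first=True)

optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR)
# ExponentialLR multiplies the learning rate by 0.9 each time step() is called,
# which we invoke once per epoch to match the reported decay schedule.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=LR_DECAY)

for epoch in range(3):                        # dummy loop with random data
    batch = torch.randn(BATCH_SIZE, 10, 300)  # (batch, seq_len, emb_dim)
    output, _ = model(batch)
    # Stand-in objective; the real one would be L_retrieval + LAMBDA_K * L_keyword.
    loss = output.pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```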