Improving Open-Domain Dialogue Response Generation with Multi-Source Multilingual Commonsense Knowledge
Authors: Sixing Wu, Jiong Yu, Jiahao Chen, Xiaofan Deng, Wei Zhou
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have verified the effectiveness of our dataset and approach in monolingual, cross-lingual, and multilingual scenarios. |
| Researcher Affiliation | Academia | (1) National Pilot School of Software, Yunnan University, Kunming, China; (2) Engineering Research Center of Cyberspace, Yunnan University, Kunming, China |
| Pseudocode | No | The paper describes the Estimate-Cluster-Penalize mechanism and its steps with equations, but it is presented as a textual description within paragraphs and mathematical formulas, not as a structured pseudocode block or algorithm figure. |
| Open Source Code | Yes | More details of our LLM instruction prompt pattern and some sampled dialogue cases can be found in our GitHub project https://github.com/YNLP/Chatbots/tree/main/AAAI2024 MMK-BART. |
| Open Datasets | No | The paper states it constructs MMK-DailyDialog by extending XDailyDialog (Liu et al. 2023) and aligning ConceptNet (Speer, Chin, and Havasi 2016), both of which are cited. However, it does not provide a specific link, DOI, or repository for the constructed MMK-DailyDialog dataset itself. |
| Dataset Splits | Yes | Table 1: The statistics of our MMK-DailyDialog. #Training: 10.5K sessions and 39.7K dialogues; #Valid/Test: 995/996 sessions and 3.83K/3.69K dialogues. |
| Hardware Specification | Yes | Depending on the 24GB VRAM of the NVIDIA RTX-3090 GPU and the input length, the gradient accumulation step is set to either 4 (when there are 20*4 facts) or 2 (in other scenarios). |
| Software Dependencies | No | All methods are implemented with PyTorch and the Huggingface library. While the software names are mentioned, specific version numbers are not provided. |
| Experiment Setup | Yes | We use a mini-batch of 32 in fine-tuning. Depending on the 24GB VRAM of the NVIDIA RTX-3090 GPU and the input length, the gradient accumulation step is set to either 4 (when there are 20*4 facts) or 2 (in other scenarios). We use the Adam optimizer and 500 warm-up steps. We search the learning rate and the number of epochs based on the English subset. For mT5, we set the learning rate to 3e-4 and train for 5 epochs. For mBART (MMK-BART), we set the learning rate to 2.5e-5 and train for 3 epochs. In inference, we select the last epoch and use a beam width of 5. For our MMK-BART, ECP clusters facts into 10 groups and sets the penalty factor λ to 0.99 by default. |
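
For orientation, the fine-tuning setup quoted in the Experiment Setup row maps onto a standard Hugging Face `Seq2SeqTrainingArguments` configuration roughly as sketched below. This is a minimal sketch, not the authors' released code: the per-device batch size (chosen so that 8 × 4 accumulation steps matches the reported mini-batch of 32), the output directory, and the use of the Trainer's default AdamW optimizer (the paper reports Adam) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported mBART/MMK-BART fine-tuning hyperparameters.
# Values marked "paper" are quoted above; the rest are assumptions.
args = Seq2SeqTrainingArguments(
    output_dir="mmk_bart_ckpt",        # assumed output path
    per_device_train_batch_size=8,     # assumption: 8 x 4 accumulation steps = mini-batch of 32
    gradient_accumulation_steps=4,     # paper: 4 when 20*4 facts are attached, otherwise 2
    learning_rate=2.5e-5,              # paper: 2.5e-5 for mBART/MMK-BART (3e-4 for mT5)
    num_train_epochs=3,                # paper: 3 epochs for mBART/MMK-BART (5 for mT5)
    warmup_steps=500,                  # paper: 500 warm-up steps
    predict_with_generate=True,        # generate during evaluation/inference
    generation_num_beams=5,            # paper: beam width of 5 at inference
)
# Note: the Trainer's default optimizer is AdamW, whereas the paper reports Adam.
```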