CEM: Commonsense-Aware Empathetic Response Generation
Authors: Sahand Sabour, Chujie Zheng, Minlie Huang
AAAI 2022, pp. 11229-11237 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on EMPATHETICDIALOGUES, which is a widely-used benchmark dataset for empathetic response generation. Empirical results demonstrate that our approach outperforms the baseline models in both automatic and human evaluations and can generate more informative and empathetic responses. The paper also includes sections such as 'Experiments', 'Baselines', 'Automatic Evaluation', 'Human Evaluation', and 'Ablation Studies'. |
| Researcher Affiliation | Academia | The CoAI Group, DCST, Institute for Artificial Intelligence, State Key Lab of Intelligent Technology and Systems, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China |
| Pseudocode | No | The paper describes the model architecture and processes in text and diagrams, but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Sahandfer/CEM. |
| Open Datasets | Yes | We conduct our experiments on the EMPATHETICDIALOGUES (Rashkin et al. 2019), a large-scale multi-turn dataset containing 25k empathetic conversations between crowdsourcing workers. |
| Dataset Splits | Yes | We used the same 8:1:1 train/valid/test split as provided by Rashkin et al. (2019). (A hedged loading sketch follows the table.) |
| Hardware Specification | Yes | All the models were trained on one single TITAN Xp GPU |
| Software Dependencies | No | The paper mentions 'PyTorch' and other components like 'GloVe vectors' and the 'Adam optimizer' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We used 300-dimensional pre-trained GloVe vectors... The hidden dimension for all corresponding components was set to 300. The Adam (Kingma and Ba 2017) optimizer with β1 = 0.9 and β2 = 0.98 was used for training. The initial learning rate was set to 0.0001 and we varied this value during training according to Vaswani et al. (2017). All the models were trained on one single TITAN Xp GPU using a batch size of 16 and early stopping. In our experiments, we set γ1 = 1, γ2 = 1, and γ3 = 1.5. (A hedged configuration sketch follows the table.) |
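
As a complement to the dataset and split entries above, the snippet below is a minimal sketch of one way to obtain the EMPATHETICDIALOGUES splits. It assumes the Hugging Face copy of the dataset (`empathetic_dialogues`) mirrors the original 8:1:1 train/valid/test release by Rashkin et al. (2019); the paper itself does not describe this loading path, and depending on the installed `datasets` version the loader may additionally require `trust_remote_code=True`.

```python
# Hedged sketch: load the EmpatheticDialogues splits via Hugging Face Datasets.
# Assumption: the hosted "empathetic_dialogues" dataset matches the original
# Rashkin et al. (2019) 8:1:1 train/valid/test release used in the paper.
from datasets import load_dataset

data = load_dataset("empathetic_dialogues")
train, valid, test = data["train"], data["validation"], data["test"]

# Print the number of utterance-level examples per split as a sanity check.
print({split: len(data[split]) for split in data})
```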
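
The experiment-setup entry can likewise be summarised as a configuration sketch. The model below is only a stand-in (the actual CEM architecture is not reproduced here), the warmup step count and the exact way the Noam-style schedule of Vaswani et al. (2017) scales the 1e-4 base learning rate are assumptions, and the names of the three loss terms weighted by γ1, γ2, γ3 are hypothetical.

```python
# Hedged sketch of the reported training configuration in PyTorch.
import torch
import torch.nn as nn

d_model = 300                              # hidden size, matching the 300-d GloVe vectors
batch_size = 16
gamma1, gamma2, gamma3 = 1.0, 1.0, 1.5     # loss-term weights reported in the paper

# Placeholder model: stands in for the CEM architecture, which is not reproduced here.
model = nn.TransformerEncoderLayer(d_model=d_model, nhead=2)

# Adam with beta1 = 0.9, beta2 = 0.98 and an initial learning rate of 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.98))

def noam_factor(step: int, warmup: int = 8000) -> float:
    """Multiplicative schedule in the style of Vaswani et al. (2017).

    The factor reaches 1.0 (i.e. the base learning rate) at `warmup` steps and
    decays afterwards; the warmup value itself is an assumption.
    """
    step = max(step, 1)
    return (warmup ** 0.5) * min(step ** -0.5, step * warmup ** -1.5)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_factor)

# Hypothetical combined objective (loss names are placeholders, not from the excerpt):
# total_loss = gamma1 * loss_emotion + gamma2 * loss_auxiliary + gamma3 * loss_generation
```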