CEM: Commonsense-Aware Empathetic Response Generation

Authors: Sahand Sabour, Chujie Zheng, Minlie Huang

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach on EMPATHETICDIALOGUES, which is a widely-used benchmark dataset for empathetic response generation. Empirical results demonstrate that our approach outperforms the baseline models in both automatic and human evaluations and can generate more informative and empathetic responses." The paper also includes sections such as 'Baselines' (under 'Experiments'), 'Automatic Evaluation', 'Human Evaluation', and 'Ablation Studies'.
Researcher Affiliation | Academia | The CoAI Group, DCST, Institute for Artificial Intelligence, State Key Lab of Intelligent Technology and Systems, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
Pseudocode | No | The paper describes the model architecture and processes in text and diagrams, but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://github.com/Sahandfer/CEM."
Open Datasets | Yes | "We conduct our experiments on the EMPATHETICDIALOGUES (Rashkin et al. 2019), a large-scale multi-turn dataset containing 25k empathetic conversations between crowdsourcing workers."
Dataset Splits | Yes | "We used the same 8:1:1 train/valid/test split as provided by Rashkin et al. (2019)." (A loading sketch follows the table.)
Hardware Specification | Yes | "All the models were trained on one single TITAN Xp GPU."
Software Dependencies | No | The paper mentions 'PyTorch' and other components such as 'GloVe vectors' and the 'Adam optimizer', but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "We used 300-dimensional pre-trained GloVe vectors... The hidden dimension for all corresponding components was set to 300. The Adam optimizer (Kingma and Ba 2017) with β1 = 0.9 and β2 = 0.98 was used for training. The initial learning rate was set to 0.0001 and we varied this value during training according to Vaswani et al. (2017). All the models were trained on one single TITAN Xp GPU using a batch size of 16 and early stopping. In our experiments, we set γ1 = 1, γ2 = 1, and γ3 = 1.5." (A training-setup sketch follows the table.)
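
For reference, here is a minimal sketch of loading the dataset splits reported above. It assumes the Hugging Face `datasets` library and its `empathetic_dialogues` dataset card, which mirrors the 8:1:1 train/valid/test partition of Rashkin et al. (2019); the CEM repository may ship its own preprocessed copies, so treat this as one convenient loading path rather than the authors' method.

```python
# Hedged sketch: load EMPATHETICDIALOGUES via the Hugging Face
# `datasets` library (an assumption; not prescribed by the paper).
from datasets import load_dataset

# The dataset card exposes the train/validation/test partition of
# Rashkin et al. (2019), roughly an 8:1:1 split over conversations.
data = load_dataset("empathetic_dialogues")

for split in ("train", "validation", "test"):
    print(split, len(data[split]))  # rows per split (utterance level)
```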
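
To make the reported training setup concrete, below is a hedged PyTorch sketch of the optimizer and learning-rate schedule: Adam with β1 = 0.9 and β2 = 0.98, a base rate of 0.0001, and the Vaswani et al. (2017) warmup/decay. The placeholder model, `warmup_steps`, and the names of the three loss terms are assumptions for illustration; the normalization of the schedule factor is one common choice and may differ from the authors' code.

```python
import torch

# Stand-in for the actual CEM model (see https://github.com/Sahandfer/CEM).
model = torch.nn.Linear(300, 300)   # hidden dimension 300, as reported
warmup_steps = 4000                 # assumption; not stated in the paper

# Adam with the betas reported in the paper and base lr 0.0001.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.98))

def vaswani_factor(step: int) -> float:
    # Warmup-then-inverse-sqrt-decay factor from Vaswani et al. (2017),
    # normalized here so it peaks at 1.0 when step == warmup_steps.
    step = max(step, 1)
    return warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=vaswani_factor)

def total_loss(loss_emo, loss_gen, loss_div):
    # Weighted combination with the reported gamma1 = 1, gamma2 = 1,
    # gamma3 = 1.5; the three term names are assumed, not quoted.
    return 1.0 * loss_emo + 1.0 * loss_gen + 1.5 * loss_div

# Tiny demo step with dummy loss terms and the reported batch size of 16.
optimizer.zero_grad()
out = model(torch.randn(16, 300))
loss = total_loss(out.mean(), out.pow(2).mean(), out.abs().mean())
loss.backward()
optimizer.step()
scheduler.step()
```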