CARE: Commonsense-Aware Emotional Response Generation with Latent Concepts

Authors: Peixiang Zhong, Di Wang, Pengfei Li, Chen Zhang, Hao Wang, Chunyan Miao (pp. 14577-14585)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two large-scale datasets support our hypothesis and show that our model can produce more accurate and commonsense-aware emotional responses and achieve better human ratings than state-of-the-art models that only specialize in one aspect.
Researcher Affiliation | Collaboration | (1) Alibaba-NTU Singapore Joint Research Institute, Nanyang Technological University (NTU), Singapore; (2) Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, NTU, Singapore; (3) School of Electrical and Electronic Engineering, NTU, Singapore; (4) Alibaba Group, China
Pseudocode | No | The paper describes algorithms and methods but does not present them in a formal "Pseudocode" or "Algorithm" block.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a repository for the CARE model.
Open Datasets | Yes | We conduct experiments on two large-scale datasets, namely Reddit and Twitter. ... we use the emotional tweets (Mohammad 2012; Mohammad et al. 2018) to train the classifier. ... We use ConceptNet (Speer, Chin, and Havasi 2017) as our CKG.
Dataset Splits | Yes | The statistics of the annotated datasets are presented in Table 3. ... Validation total: 49K (Reddit) / 50K (Twitter).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or detailed cloud instance names used for running experiments.
Software Dependencies | No | The paper mentions using Adam, TransE, Transformer, and GloVe embeddings but does not provide specific version numbers for these or other software libraries/environments (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Our TransE embeddings have a dimension of 100... Our Transformer model has 1 layer and 4 attention heads... We initialize the word embedding layer with pre-trained GloVe embeddings... of size 300. The emotion embedding and feedforward layers have sizes of 50 and 512, respectively. We train our model using Adam... with learning rate of 1, batch size of 64, and dropout of 0.1 for 80K steps, including 6K steps for warmup. We empirically construct 30 relational latent concepts and 10 emotional latent concepts... We use label smoothing of 0.1, total smoothing value of 0.08 for latent concepts in DLS, and top-10 decoding with γ = 1 in CATD.
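The Open Datasets row notes that the paper uses ConceptNet as its commonsense knowledge graph (CKG) but releases no retrieval code. The following is a minimal sketch of pulling commonsense triples from the public ConceptNet 5 web API; the function name `related_concepts` and the `top_k` parameter are illustrative assumptions, not code from the authors.

```python
# Minimal sketch (not from the paper): fetch commonsense triples for an
# English word from the public ConceptNet 5 API, the resource the paper
# uses as its CKG. Function and parameter names are illustrative only.
import requests

def related_concepts(word: str, top_k: int = 10):
    """Return up to top_k (relation, head, tail, weight) triples for a word."""
    url = f"http://api.conceptnet.io/c/en/{word}"
    edges = requests.get(url, params={"limit": top_k}).json().get("edges", [])
    triples = []
    for e in edges:
        triples.append((
            e["rel"]["label"],      # e.g. "RelatedTo", "IsA"
            e["start"]["label"],    # head concept
            e["end"]["label"],      # tail concept
            e.get("weight", 0.0),   # edge confidence weight
        ))
    return triples

if __name__ == "__main__":
    for rel, head, tail, w in related_concepts("rain"):
        print(f"{head} --{rel}--> {tail} (weight={w:.2f})")
```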
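The Experiment Setup row lists the reported hyperparameters; the sketch below simply collects them into a single configuration object for reference. The class and field names (e.g., `CAREConfig`, `concept_smoothing_total`) are illustrative assumptions; only the values are taken from the paper.

```python
# Hypothetical configuration object gathering the hyperparameters reported
# in the paper's experiment setup. Field names are assumptions; values are
# quoted from the paper.
from dataclasses import dataclass

@dataclass
class CAREConfig:
    # Knowledge-graph and word embeddings
    transe_dim: int = 100               # TransE embedding dimension
    glove_dim: int = 300                # pre-trained GloVe word embeddings
    # Transformer response generator
    num_layers: int = 1
    num_heads: int = 4
    emotion_emb_dim: int = 50
    feedforward_dim: int = 512
    # Optimization (Adam)
    learning_rate: float = 1.0
    batch_size: int = 64
    dropout: float = 0.1
    train_steps: int = 80_000
    warmup_steps: int = 6_000
    # Latent concepts and decoding
    num_relational_concepts: int = 30
    num_emotional_concepts: int = 10
    label_smoothing: float = 0.1
    concept_smoothing_total: float = 0.08   # total smoothing for latent concepts in DLS
    decode_top_k: int = 10
    gamma: float = 1.0                      # γ in CATD decoding

if __name__ == "__main__":
    print(CAREConfig())
```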