DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition

Authors: Weizhou Shen, Junqing Chen, Xiaojun Quan, Zhixian Xie
Pages: 13789-13797

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on four ERC benchmarks with mainstream models presented for comparison. The experimental results show that the proposed model outperforms the baselines on all the datasets. Several other experiments, such as an ablation study and error analysis, are also conducted, and the results confirm the role of the critical modules of DialogXL.
Researcher Affiliation | Academia | Weizhou Shen, Junqing Chen, Xiaojun Quan*, Zhixian Xie; Sun Yat-sen University, China; {shenwzh3, chenjq95, xiezhx23}@mail2.sysu.edu.cn, quanxj3@mail.sysu.edu.cn
Pseudocode | No | The paper describes the model architecture and mathematical formulations, but does not include structured pseudocode or an algorithm block.
Open Source Code | Yes | The implementation is available at https://github.com/shenwzh3/DialogXL.
Open Datasets | Yes | We evaluate DialogXL on four multi-turn multi-party ERC datasets. The statistics of them are shown in Table 1. IEMOCAP (Busso et al. 2008): A multimodal conversational dataset for emotion recognition... MELD (Poria et al. 2019): A multimodal dataset for emotion recognition collected from the TV show Friends... DailyDialog (Li et al. 2017): Human-written daily communications... EmoryNLP (Zahiri and Choi 2017): TV show scripts collected from Friends...
Dataset Splits | Yes | The statistics of them are shown in Table 1. IEMOCAP (Busso et al. 2008): ... Since this dataset has no validation set, we follow (Zhong, Wang, and Miao 2019) to use the last 20 dialogues in the training set for validation.
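The IEMOCAP split described above (holding out the last 20 training dialogues for validation) can be sketched as follows. This is a minimal illustration, not code from the paper; the function name and placeholder dialogue IDs are hypothetical.

```python
def split_train_val(dialogues, n_val=20):
    """Hold out the last `n_val` dialogues of the training set for
    validation, since IEMOCAP ships without an official validation split."""
    if n_val >= len(dialogues):
        raise ValueError("n_val must be smaller than the number of dialogues")
    return dialogues[:-n_val], dialogues[-n_val:]

# Toy usage with placeholder dialogue IDs:
train, val = split_train_val([f"dlg_{i}" for i in range(120)])
# train keeps the first 100 dialogues; val keeps the last 20
```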
Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments, such as specific GPU models, CPU types, or memory.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | Hyperparameter tuning for each dataset is conducted with hold-out validation on the validation set. The tunable hyperparameters include the learning rate, the number of heads for the four types of attention in dialog-aware self-attention, the max length of memory, and the dropout rate.