Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Beyond Single Emotion: Multi-label Approach to Conversational Emotion Recognition
Authors: Yujin Kang, Yoon-Sik Cho
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical results on the existing single-label task support the efficacy of our approach, which is most effective in the most challenging settings: emotion shift or confusing labels. We also evaluate ML-ERC with the multi-labels we produced to support our contrastive learning scheme. We conduct extensive experiments to verify the effectiveness of our proposed model. We integrate our multi-label scheme into existing single-label ERC models, and show how our objective improves all of the existing baseline models. |
| Researcher Affiliation | Academia | Yujin Kang, Yoon-Sik Cho Department of Artificial Intelligence, Chung-Ang University, Republic of Korea EMAIL |
| Pseudocode | Yes | Algorithm 1: Learning procedure of ML-ERC for each batch B at each epoch. Once the model runs several iterations, we conduct soft-labeling. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described. It only mentions that baseline results were reproduced using 'original code' or 'respective official codebase', referring to other works. |
| Open Datasets | Yes | We conduct experiments on three benchmark ERC datasets annotated with single labels. EmoryNLP (Zahiri and Choi 2018) is labeled with joyful, mad, neutral, peaceful, powerful, scared, and sad from the Feeling Wheel (Willcox 1982). MELD (Poria et al. 2019) is a multi-modal dataset with a label set that includes anger, disgust, fear, joy, neutral, surprise, and sadness. IEMOCAP (Busso et al. 2008) is a dyadic multimodal dataset with labels including excited, neutral, frustrated, sadness, happiness, and anger. |
| Dataset Splits | No | The paper notes that 'The statistics for each dataset are provided in Table S4 in Appendix F.', which is outside the main text. It does not explicitly state the training, validation, or test splits (e.g., percentages or counts) in the main body of the paper. |
| Hardware Specification | Yes | All experiments are performed on an Nvidia RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions using 'RoBERTa-Large' as an embedding module but does not provide specific version numbers for any software libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | To train our ML-ERC method, we set the learning rate, batch size, and number of epochs to 1e-6, 16, and 30, respectively. We fix τ in Eqs. 8 and 9 to 0.05. For α in Eq. 12, we search the parameter using the validation set, setting α to 0.7 for EmoryNLP, 0.1 for MELD, and 0.4 for IEMOCAP. The hyperparameter β, which is set to 0.5, controls the integration of the ML-ERC loss (L_ML-ERC) with the original loss from the ERC model (L_ERC). |
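The experiment-setup values reported above can be collected into a minimal configuration sketch. This is an illustration only, not the authors' code: the dict layout and the `total_loss` helper are assumptions, the learning rate is read as 1e-6 (the extracted text shows "1e6", which would be implausibly large), and the weighted-sum form of the combined objective is inferred from the statement that β "controls the integration" of L_ML-ERC with L_ERC.

```python
# Hedged sketch of the reported ML-ERC training setup.
# Names and structure are illustrative; values come from the paper's
# Experiment Setup row as extracted by the pipeline.

CONFIG = {
    "learning_rate": 1e-6,  # assumed reading of the extracted "1e6"
    "batch_size": 16,
    "epochs": 30,
    "temperature": 0.05,    # tau in Eqs. 8-9
    "beta": 0.5,            # weight integrating L_ML-ERC with L_ERC
    # alpha (Eq. 12) was searched on the validation set per dataset:
    "alpha": {"EmoryNLP": 0.7, "MELD": 0.1, "IEMOCAP": 0.4},
}


def total_loss(l_erc: float, l_ml_erc: float,
               beta: float = CONFIG["beta"]) -> float:
    """Combine the base ERC loss with the ML-ERC objective.

    A simple weighted sum is assumed here; the paper only states that
    beta controls how L_ML-ERC is integrated with L_ERC.
    """
    return l_erc + beta * l_ml_erc
```

For example, with β = 0.5, a base loss of 1.0 and an ML-ERC loss of 2.0 would combine to 2.0 under this assumed weighting.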