Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multi-Modal Attentive Prompt Learning for Few-shot Emotion Recognition in Conversations
Authors: Xingwei Liang, Geng Tu, Jiachen Du, Ruifeng Xu
JAIR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our proposed model's efficacy, we conducted extensive experiments on two widely recognized benchmark datasets, MELD and IEMOCAP. Our results demonstrate that the MAP framework outperforms state-of-the-art ERC models, yielding notable improvements of 3.5% and 0.4% in micro F1 scores. |
| Researcher Affiliation | Academia | Xingwei Liang (EMAIL), Geng Tu (EMAIL), Jiachen Du (EMAIL), Harbin Institute of Technology, Shenzhen, P.R. China, 518055; Ruifeng Xu (EMAIL), Harbin Institute of Technology, Shenzhen, P.R. China, 518055; Peng Cheng Laboratory, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies |
| Pseudocode | No | The paper describes the model architecture and procedures in detail using prose and mathematical equations. It does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps formatted like code. |
| Open Source Code | No | The paper does not provide any concrete access information for source code, such as a repository link, an explicit code release statement, or mention of code in supplementary materials. |
| Open Datasets | Yes | To evaluate our proposed model's efficacy, we conducted extensive experiments on two widely recognized benchmark datasets, MELD and IEMOCAP. Our results demonstrate that the MAP framework outperforms state-of-the-art ERC models, yielding notable improvements of 3.5% and 0.4% in micro F1 scores. ... Datasets. MELD (Poria et al., 2019) and IEMOCAP (Busso et al., 2008) are selected as our datasets. MELD contains 13,708 utterances from 1433 dialogues of the Friends TV series. It annotates each utterance with one of seven emotions (anger, disgust, fear, joy, neutral, sadness, or surprise). It contains a total of approximately 33 hours of dialogues. IEMOCAP is a multi-modal database of ten speakers involved in two-way dyadic conversations. ... MELD: https://affective-meld.github.io/ IEMOCAP: http://sail.usc.edu/iemocap/ |
| Dataset Splits | Yes | MELD and IEMOCAP are multi-modal ERC datasets that involve all the textual, visual, and acoustic information. The dataset details are provided in Table 1. ... Table 1: Training, validation, and test data distribution in the datasets. |
| Hardware Specification | No | The paper mentions training time comparisons in Section 4.7 and refers to models having a certain number of parameters, but it does not specify the exact hardware (e.g., GPU models, CPU types, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions various software components and models used, such as BERT, ResNet, VGGish, Bi-GRU, Transformer, RoBERTa, EmoBERTa, and the sklearn package. However, it does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Hyperparameter Setting. The textual, visual, and acoustic inputs are initialized with BERT, ResNet, and VGGish, respectively. All weight matrices are given their initial values by sampling from a uniform distribution U(-0.1, 0.1). The optimal learning rate is set to 4e-7 for the MELD dataset and 6e-7 for the IEMOCAP dataset. The batch size is set to 1 and the number of epochs is set to 50 for MELD and 150 for IEMOCAP. The dropout rate is set to 0.1 for MELD and 0.5 for IEMOCAP. |
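The Research Type row reports micro-F1 improvements. As a clarifying aside, micro F1 pools true-positive, false-positive, and false-negative counts across all emotion classes before computing a single F1; the sketch below implements that pooling with only the standard library. The `micro_f1` function and the toy emotion labels are illustrative and are not from the paper or the authors' code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool per-class TP/FP/FN counts, then compute one F1."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        tp += sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp += sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn += sum(t == c and p != c for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels for illustration only (not MELD or IEMOCAP data).
y_true = ["joy", "anger", "neutral", "sadness", "joy", "neutral"]
y_pred = ["joy", "neutral", "neutral", "sadness", "joy", "anger"]
score = micro_f1(y_true, y_pred)  # 4 of 6 correct -> 2/3
```

Note that for single-label multi-class tasks like ERC, pooled false positives equal pooled false negatives, so micro F1 coincides with accuracy.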
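The hyperparameters quoted in the Experiment Setup row can be collected into a small sketch. Only the numeric values come from the paper; the `CONFIG` dict, the `init_weights` helper, and the seed are illustrative, and the symmetric uniform interval U(-0.1, 0.1) is an assumption about the reported range.

```python
import random

# Per-dataset hyperparameters as quoted in the Experiment Setup row.
CONFIG = {
    "MELD":    {"lr": 4e-7, "batch_size": 1, "epochs": 50,  "dropout": 0.1},
    "IEMOCAP": {"lr": 6e-7, "batch_size": 1, "epochs": 150, "dropout": 0.5},
}

def init_weights(rows, cols, low=-0.1, high=0.1, seed=0):
    """Sample a weight matrix from U(low, high), mirroring the reported init."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(cols)] for _ in range(rows)]

W = init_weights(3, 5)  # every entry lies in [-0.1, 0.1]
```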