Tailor Versatile Multi-Modal Learning for Multi-Label Emotion Recognition

Authors: Yi Zhang, Mingyuan Chen, Jundong Shen, Chongjun Wang

AAAI 2022, pp. 9100-9108

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In addition, we conduct experiments on the benchmark MMER dataset CMU-MOSEI in both aligned and unaligned settings, which demonstrate the superiority of TAILOR over the state-of-the-arts." "In this section, we give empirically evaluations and analysis of our proposed TAILOR."
Researcher Affiliation | Academia | State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China {njuzhangy, mychen, jdshen}@smail.nju.edu.cn, {chjwang}@nju.edu.cn
Pseudocode | No | The paper includes mathematical equations, but it does not present any pseudocode or algorithm blocks with structured steps formatted like code.
Open Source Code | Yes | https://github.com/kniter1/TAILOR
Open Datasets | Yes | "We conduct experiments on benchmark multimodal multi-label dataset CMU-MOSEI (Zadeh et al. 2018c)"
Dataset Splits | No | "Table 1 summarizes details of CMU-MOSEI in both word-aligned and unaligned settings." While the paper mentions using CMU-MOSEI, a benchmark dataset, it does not explicitly provide the training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility; it only lists modality dimensions and sequence lengths. (See the split-reporting sketch after the table.)
Hardware Specification | Yes | "All experiments are running with one GTX 1080Ti GPU."
Software Dependencies | No | The paper mentions that parameters are optimized by Adam (Kingma and Ba 2015), but it does not provide specific version numbers for any software components, libraries, or programming languages used.
Experiment Setup | Yes | "We set hyper-parameters α = 0.01, β = 5e-6 and γ = 0.5. The batch size is 64. For layer number in Transformer Encoder, we set nv = na = 4, nt = 6 in uni-modal encoders, nc = 3 in cross-modal encoders. The size of hidden layers in encoders and decoder is d = 256, the head number hl = hm = 8. All parameters in TAILOR are optimized by Adam (Kingma and Ba 2015) with an initial learning rate of 1e-5 for the aligned setting, 1e-4 for the unaligned setting, and employ a linear decay learning rate schedule with a warm-up strategy." (See the configuration sketch after the table.)
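
To make the reported setup concrete, below is a minimal PyTorch sketch of the training configuration, assuming a standard torch.optim.Adam optimizer and a hand-rolled linear warm-up/decay schedule. The stand-in model, warm-up length (WARMUP_STEPS), and total step count (TOTAL_STEPS) are illustrative placeholders, since the paper does not report them.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

# Hyper-parameters reported in the paper.
ALPHA, BETA, GAMMA = 0.01, 5e-6, 0.5  # loss-term weights
BATCH_SIZE = 64
LR = 1e-5             # aligned setting (the paper uses 1e-4 when unaligned)

# Placeholders: the paper gives no warm-up length or total step count.
WARMUP_STEPS = 500
TOTAL_STEPS = 10_000

model = torch.nn.Linear(256, 6)  # stand-in for TAILOR (hidden size d = 256)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

def linear_warmup_then_decay(step: int) -> float:
    """Linear warm-up to the base LR, then linear decay toward zero."""
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    return max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))

scheduler = LambdaLR(optimizer, lr_lambda=linear_warmup_then_decay)

# Typical usage in a training loop: optimizer.step(), then scheduler.step().
```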
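
For the Dataset Splits row, a reproducible report would pin the partition explicitly. The sketch below shows one common way to do that with a fixed seed; the 70/15/15 ratio and the segment count are illustrative assumptions, not the paper's protocol (CMU-MOSEI also ships with standard folds, which the authors may have used without saying so).

```python
from sklearn.model_selection import train_test_split

# Illustrative only: NOT the split used in the paper.
num_segments = 23453  # commonly cited CMU-MOSEI segment count; treat as a placeholder
indices = list(range(num_segments))

# Seeded 70/15/15 split so the partition can be reproduced exactly.
train_idx, rest = train_test_split(indices, test_size=0.30, random_state=42)
val_idx, test_idx = train_test_split(rest, test_size=0.50, random_state=42)

print(len(train_idx), len(val_idx), len(test_idx))  # sizes of the three partitions
```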