CARAT: Contrastive Feature Reconstruction and Aggregation for Multi-Modal Multi-Label Emotion Recognition

Authors: Cheng Peng, Ke Chen, Lidan Shou, Gang Chen

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two benchmark datasets CMU-MOSEI and M3ED demonstrate the effectiveness of CARAT over state-of-the-art methods.
Researcher Affiliation | Academia | The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou 310000, China {chengchng, chenk, should, cg}@zju.edu.cn
Pseudocode | No | The paper describes the CARAT framework in detail with mathematical formulations, but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/chengzju/CARAT.
Open Datasets | Yes | We evaluate CARAT on two benchmark MMER datasets (CMU-MOSEI (Zadeh et al. 2018b) and M3ED (Zhao et al. 2022)), which maintain the settings in the public SDK.
Dataset Splits | Yes | During training, we train each method for 20 epochs and select the model with the best F1 score on the validation set as our final model.
Hardware Specification | Yes | All experiments are conducted with one NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions various algorithms and optimizers used (e.g., Transformer, Adam optimizer, SCL), but does not specify software dependencies with version numbers (e.g., Python version, specific deep learning framework versions like PyTorch or TensorFlow).
Experiment Setup | Yes | We set the size of hidden states as d = 256, d_z = 64. The size of the embedding queue is set to 8192. All encoders En_m(·) and decoders De_m(·) are implemented by 2-layer MLPs. We set hyper-parameters γ_o = 0.01, γ_α = 0.1, γ_β = 1, γ_s = 1, γ_sf = 0.1 and γ_r = 1; the analysis of different weight settings is presented in Appendix A. We set l_t = 6, l_v = l_a = 4 for the layer number of Transformer Encoders. We employ the Adam (Kingma and Ba 2014) optimizer with an initial learning rate of 5e-5 and a linear decay learning rate schedule with a warm-up strategy. The batch size B is set to 64. (A configuration sketch follows the table.)
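
Based on the hyper-parameters quoted in the Experiment Setup row, the following is a minimal sketch of how that configuration could be wired up, assuming PyTorch. The dictionary keys, the placeholder model, the warm-up ratio, and the steps-per-epoch value are illustrative assumptions rather than details taken from the paper; only the numeric settings quoted above come from the source.

```python
# Minimal sketch of the reported training configuration, assuming PyTorch.
# Only the numeric values come from the "Experiment Setup" and "Dataset Splits"
# rows; all names, the warm-up ratio, and the placeholder model are assumptions.
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

config = {
    "hidden_size": 256,      # d
    "latent_size": 64,       # d_z
    "queue_size": 8192,      # embedding queue
    "mlp_layers": 2,         # 2-layer MLP encoders/decoders
    "transformer_layers": {"text": 6, "visual": 4, "audio": 4},  # l_t, l_v, l_a
    "loss_weights": {"o": 0.01, "alpha": 0.1, "beta": 1.0,
                     "s": 1.0, "sf": 0.1, "r": 1.0},             # gamma_* terms
    "lr": 5e-5,
    "batch_size": 64,
    "epochs": 20,
}

def linear_warmup_decay(step, warmup_steps, total_steps):
    """Linear warm-up to the base LR, then linear decay to zero (assumed shape)."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# A placeholder module stands in for the CARAT model so the wiring is runnable.
model = nn.Linear(config["hidden_size"], config["latent_size"])
optimizer = Adam(model.parameters(), lr=config["lr"])

steps_per_epoch = 100                     # assumed; depends on dataset size
total_steps = config["epochs"] * steps_per_epoch
warmup_steps = int(0.1 * total_steps)     # 10% warm-up is an assumption
scheduler = LambdaLR(
    optimizer,
    lr_lambda=lambda step: linear_warmup_decay(step, warmup_steps, total_steps),
)
```

In an actual run, `optimizer.step()` and `scheduler.step()` would be called once per batch and, per the Dataset Splits row, the checkpoint with the best validation F1 over the 20 epochs would be kept as the final model.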