CARAT: Contrastive Feature Reconstruction and Aggregation for Multi-Modal Multi-Label Emotion Recognition
Authors: Cheng Peng, Ke Chen, Lidan Shou, Gang Chen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two benchmark datasets CMU-MOSEI and M3ED demonstrate the effectiveness of CARAT over state-of-the-art methods. |
| Researcher Affiliation | Academia | The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou 310000, China {chengchng, chenk, should, cg}@zju.edu.cn |
| Pseudocode | No | The paper describes the CARAT framework in detail with mathematical formulations, but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/chengzju/CARAT. |
| Open Datasets | Yes | We evaluate CARAT on two benchmark MMER datasets (CMU-MOSEI (Zadeh et al. 2018b) and M3ED (Zhao et al. 2022)), which maintain settings in the public SDK. |
| Dataset Splits | Yes | During training, we train methods for 20 epochs to select the model with the best F1 score on the validation set as our final model. |
| Hardware Specification | Yes | All experiments are conducted with one NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions various algorithms and optimizers used (e.g., Transformer, Adam optimizer, SCL), but does not specify software dependencies with version numbers (e.g., Python version, specific deep learning framework versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | We set the size of hidden states as d = 256, d_z = 64. The size of the embedding queue is set to 8192. All encoders En_m(·) and decoders De_m(·) are implemented by 2-layer MLPs. We set hyper-parameters γ_o = 0.01, γ_α = 0.1, γ_β = 1, γ_s = 1, γ_sf = 0.1 and γ_r = 1; the analysis of different weight settings is presented in Appendix A. We set l_t = 6, l_v = l_a = 4 for the number of Transformer encoder layers. We employ the Adam (Kingma and Ba 2014) optimizer with an initial learning rate of 5e-5 and a linear decay learning rate schedule with a warm-up strategy. The batch size B is set to 64. |
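
The training recipe quoted in the Experiment Setup row (Adam, 5e-5 initial learning rate, linear decay with warm-up, batch size 64, 2-layer MLP encoders/decoders) can be sketched as below. This is a minimal sketch assuming a PyTorch implementation; the paper does not state the framework, the warm-up length, or the total step count, so `WARMUP_STEPS`, `TOTAL_STEPS`, and the stand-in `encoder` module are illustrative assumptions rather than values from CARAT.

```python
# Hedged sketch of the reported training configuration, assuming PyTorch.
import torch

D_HIDDEN, D_Z = 256, 64        # hidden sizes d and d_z reported in the paper
BATCH_SIZE = 64                # batch size B reported in the paper
LR = 5e-5                      # initial learning rate reported in the paper
EPOCHS = 20                    # training epochs reported in the paper
WARMUP_STEPS = 500             # assumption: warm-up length is not reported
TOTAL_STEPS = 10_000           # assumption: depends on dataset size and epochs

# Stand-in for one of the paper's 2-layer MLP encoders En_m(.); not the CARAT model itself.
encoder = torch.nn.Sequential(
    torch.nn.Linear(D_HIDDEN, D_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(D_HIDDEN, D_Z),
)

optimizer = torch.optim.Adam(encoder.parameters(), lr=LR)

def lr_lambda(step: int) -> float:
    """Linear warm-up followed by linear decay, as described in the setup."""
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    return max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# Per training step: loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```

The released code at https://github.com/chengzju/CARAT is the authoritative reference; the snippet above only mirrors the hyper-parameters quoted in the table.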