Coupled Mamba: Enhanced Multimodal Fusion with Coupled State Space Model
Authors: Wenbing Li, Hang Zhou, Junqing Yu, Zikai Song, Wei Yang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CMU-MOSEI, CH-SIMS, CH-SIMSV2, BRCA, and MM-IMDB with multi-domain input verify the effectiveness of our model compared to current state-of-the-art methods, improving F1-Score by 0.4%, 0.9%, and 2.3% on the CMU-MOSEI, CH-SIMS, and CH-SIMSV2 datasets respectively, with 49% faster inference and 83.7% GPU memory savings. |
| Researcher Affiliation | Academia | Wenbing Li, Hang Zhou, Junqing Yu, Zikai Song, Wei Yang; Huazhong University of Science and Technology; {wenbingli, henrryzh, yjqing, skyesong, weiyangcs}@hust.edu.cn |
| Pseudocode | Yes | Algorithm 1: Coupled Mamba |
| Open Source Code | Yes | Code is available at https://github.com/hustcselwb/coupledmamba. |
| Open Datasets | Yes | We conduct experiments on five benchmark datasets (CMU-MOSEI, CH-SIMS [24], CH-SIMSV2 [25], MM-IMDB, and BRCA). |
| Dataset Splits | Yes | The CMU-MOSEI dataset, an extension of CMU-MOSI, contains 22856 samples of movie review video clips. In this dataset, 16326 samples are used as the training set, and the remaining 1871 and 4659 samples are used as the validation set and test set respectively. |
| Hardware Specification | Yes | All experiments were conducted on a Linux workstation equipped with a single NVIDIA 32GB V100 GPU and a 32-core Intel Xeon CPU. |
| Software Dependencies | Yes | The environment we use is Python 3.10, CUDA 12.1, torch 2.12. |
| Experiment Setup | Yes | Each Mamba block uses a hidden dimension size of 128, an expansion coefficient of 2, a convolution kernel size of 4, and = dstate/8; the model stacks 3 layers to train our Coupled Mamba. We use Adam to optimize the model, setting the learning rate to 0.0005 and the weight decay coefficient to 0.0005, training for 150 epochs with batch sizes of 1024, 128, and 256 on CMU-MOSEI, CH-SIMS, and CH-SIMSV2 respectively. L1 loss is used as the loss function for the regression task, and cross-entropy is used as the loss function for the classification task. |
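For concreteness, the reported training configuration can be collected into a single config object. This is a minimal sketch for reproduction attempts, not code from the authors' repository; the class and field names (`CoupledMambaConfig`, `hidden_dim`, `expand`, etc.) are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class CoupledMambaConfig:
    """Hyperparameters as reported in the paper's experiment setup."""
    # Per-block Mamba settings
    hidden_dim: int = 128     # hidden dimension size
    expand: int = 2           # expansion coefficient
    conv_kernel: int = 4      # convolution kernel size
    num_layers: int = 3       # number of stacked layers
    # Adam optimization settings
    lr: float = 5e-4
    weight_decay: float = 5e-4
    epochs: int = 150
    # Dataset-specific batch sizes
    batch_size: dict = field(default_factory=lambda: {
        "CMU-MOSEI": 1024,
        "CH-SIMS": 128,
        "CH-SIMSV2": 256,
    })

cfg = CoupledMambaConfig()
print(cfg.hidden_dim, cfg.num_layers, cfg.batch_size["CH-SIMS"])  # → 128 3 128
```

A `default_factory` is used for the batch-size dictionary because mutable defaults are not allowed directly on dataclass fields; the per-dataset lookup mirrors the three batch sizes quoted above.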