Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning
Authors: Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on the MUStARD, CMU-MOSI, and CMU-MOSEI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. |
| Researcher Affiliation | Academia | 1University of Southern California, Los Angeles, CA 90089, USA 2Hong Kong University of Science and Technology, Hong Kong, China |
| Pseudocode | Yes | The pseudo-code of the ITHP algorithm is provided in Appendix D. |
| Open Source Code | Yes | Our codebase can be found in [https://github.com/joshuaxiao98/ITHP]. |
| Open Datasets | Yes | In this section, we evaluate our proposed Information-Theoretic Hierarchical Perception (ITHP) model on three popular multimodal datasets: the Multimodal Sarcasm Detection Dataset (MUStARD; Castro et al., 2019), the Multimodal Opinion-level Sentiment Intensity dataset (MOSI; Zadeh et al., 2016), and the Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI; Zadeh et al., 2018d). |
| Dataset Splits | Yes | The evaluation is performed using a 5-fold cross-validation approach to ensure robustness and reliability. |
| Hardware Specification | Yes | All experiments were conducted on Nvidia A100 40GB GPUs. |
| Software Dependencies | No | The paper mentions using assets from BERT and DeBERTa and provides links to their general GitHub repositories, but it does not specify version numbers for the language or libraries (e.g., Python, PyTorch, TensorFlow) used in the authors' own implementation. |
| Experiment Setup | Yes | For the task of sarcasm detection, unless otherwise specified, we set the hyperparameters as follows: β = 32, γ = 8, λ = 1. We perform a 5-fold cross-validation, and for each experiment, we train the ITHP model for 200 epochs using an Adam optimizer with a learning rate of 10^-3. For the task of sentiment analysis, unless otherwise specified, we set the hyperparameters as follows: β = 8, γ = 32, λ = 1. We run each experiment for 40 epochs using an Adam optimizer with a learning rate of 10^-5. |
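
The "Dataset Splits" and "Experiment Setup" rows above pin down a concrete training protocol: 5-fold cross-validation with an Adam optimizer at the stated learning rates and epoch counts. Below is a minimal sketch of that protocol in PyTorch, not the authors' implementation. The `build_model` constructor, its `(loss, logits)` return interface, and the `(inputs, labels)` batch format are assumptions for illustration; the actual ITHP loss weighted by β, γ, λ is defined in the authors' repository and is not reproduced here.

```python
# Sketch of the reported 5-fold protocol (MUStARD sarcasm detection:
# 200 epochs, Adam, lr = 1e-3). The model interface is hypothetical;
# the beta/gamma/lambda-weighted ITHP loss is assumed to live inside
# the model returned by build_model.
import numpy as np
import torch
from torch.utils.data import DataLoader, Subset
from sklearn.model_selection import KFold

def cross_validate(dataset, build_model, folds=5, epochs=200,
                   lr=1e-3, batch_size=32, device="cuda"):
    accs = []
    splitter = KFold(n_splits=folds, shuffle=True, random_state=0)
    for train_idx, val_idx in splitter.split(np.arange(len(dataset))):
        model = build_model().to(device)          # hypothetical constructor
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        train_loader = DataLoader(Subset(dataset, train_idx.tolist()),
                                  batch_size=batch_size, shuffle=True)
        val_loader = DataLoader(Subset(dataset, val_idx.tolist()),
                                batch_size=batch_size)
        for _ in range(epochs):
            model.train()
            for inputs, labels in train_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                loss, _ = model(inputs, labels)   # assumed (loss, logits) return
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        # Simple accuracy on the held-out fold.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                _, logits = model(inputs.to(device), labels.to(device))
                correct += (logits.argmax(-1).cpu() == labels).sum().item()
                total += labels.numel()
        accs.append(correct / total)
    return float(np.mean(accs))
```

For the sentiment-analysis setting in the table, the same loop would be invoked with `epochs=40` and `lr=1e-5`, with the paper's β = 8, γ = 32, λ = 1 passed to the model's loss in place of the sarcasm-detection values.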