Provable Dynamic Fusion for Low-Quality Multimodal Data
Authors: Qingyang Zhang, Haitao Wu, Changqing Zhang, Qinghua Hu, Huazhu Fu, Joey Tianyi Zhou, Xi Peng
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on multiple benchmarks can support our findings. In this section, we conduct experiments on multiple datasets of diverse applications. |
| Researcher Affiliation | Academia | 1College of Intelligence and Computing, Tianjin University, Tianjin, China 2Tianjin Key Lab of Machine Learning, Tianjin University, Tianjin, China 3Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore 4Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore 5College of Computer Science, Sichuan University, Chengdu, China. |
| Pseudocode | Yes | Algorithm 1: Training Pseudo Code of Quality-aware Multimodal Fusion (QMF). A hedged sketch of such a training step appears after this table. |
| Open Source Code | Yes | Code is available at https://github.com/QingyangZhang/QMF. |
| Open Datasets | Yes | We evaluate our method on two multimodal classification tasks. Scene recognition: NYU Depth V2 (Silberman et al., 2012) and SUN RGB-D (Song et al., 2015) are two public indoor scene recognition datasets, each associated with two modalities, i.e., RGB and depth images. Image-text classification: the UPMC FOOD101 dataset (Wang et al., 2015) contains (possibly noisy) images obtained by Google Image Search together with corresponding textual descriptions. The MVSA sentiment analysis dataset (Niu et al., 2016) includes a set of image-text pairs with manual annotations collected from social media. |
| Dataset Splits | Yes | For FOOD101, following previous work (Kiela et al., 2019), there are 60,101 image-text pairs in the training set, 5,000 in the validation set, and 21,695 in the test set. For MVSA, the validation set contains 518 image-text pairs and the test set contains 519. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software components such as the Adam optimizer, ResNet, BERT, and MindSpore, but does not provide version numbers for these dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | The learning rate is 1e-4, the dropout rate is 0.1, and the warmup rate is 0.1. The hyperparameter λ is set to 0.1, and the temperature parameters $\{T_m\}_{m=1}^{M}$ are set to 1. An early-stop strategy based on validation accuracy is adopted. A hedged configuration sketch appears after this table. |
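
The Pseudocode row cites Algorithm 1, the training pseudocode of Quality-aware Multimodal Fusion (QMF). As a rough illustration only, here is a minimal PyTorch sketch of a quality-aware fusion step: per-modality classifiers are combined with per-sample confidence weights derived from an energy score, and λ weights an auxiliary per-modality term. The energy-based weighting, the role of λ, and all function names here are assumptions of this sketch, not a transcription of the paper's Algorithm 1.

```python
import torch
import torch.nn.functional as F

def energy_confidence(logits, T=1.0):
    # Negative free energy as a confidence proxy: T * logsumexp(f(x)/T).
    # Using this as the "quality" signal is an assumption of the sketch.
    return T * torch.logsumexp(logits / T, dim=-1)

def qmf_training_step(logits_per_modality, labels, lam=0.1, T=1.0):
    """One hedged QMF-style step. `logits_per_modality` is a list of
    [batch, num_classes] tensors, one per modality."""
    # Auxiliary unimodal objective (its exact form in the paper may differ).
    unimodal_loss = sum(F.cross_entropy(z, labels) for z in logits_per_modality)
    # Dynamic fusion: per-sample weights from each modality's confidence.
    conf = torch.stack([energy_confidence(z, T) for z in logits_per_modality], dim=-1)
    weights = torch.softmax(conf, dim=-1)  # [batch, num_modalities]
    fused = sum(w.unsqueeze(-1) * z
                for w, z in zip(weights.unbind(-1), logits_per_modality))
    fusion_loss = F.cross_entropy(fused, labels)
    # lam = 0.1 matches the quoted setup; what it multiplies is an assumption.
    return fusion_loss + lam * unimodal_loss
```

The quoted temperature parameters $\{T_m\}_{m=1}^{M} = 1$ correspond to `T=1.0` here; the sketch uses a single shared temperature for brevity.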
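
For the Experiment Setup row, the sketch below wires the quoted hyperparameters (learning rate 1e-4, warmup rate 0.1, early stopping on validation accuracy) into a plausible Adam-plus-warmup configuration. The schedule shape and the early-stopping patience are assumptions; the paper states the values but not these details.

```python
import torch

LR, WARMUP_RATE, DROPOUT, LAMBDA = 1e-4, 0.1, 0.1, 0.1  # values quoted above

def make_optimizer(model, total_steps):
    # Adam at the quoted learning rate.
    opt = torch.optim.Adam(model.parameters(), lr=LR)
    warmup_steps = max(1, int(WARMUP_RATE * total_steps))
    # Linear warmup, then constant LR; the warmup shape is an assumption.
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda step: min(1.0, (step + 1) / warmup_steps))
    return opt, sched

def should_early_stop(val_accs, patience=5):
    # Stop when validation accuracy has not improved for `patience` epochs
    # (the patience value itself is an assumption).
    best_epoch = val_accs.index(max(val_accs))
    return len(val_accs) - 1 - best_epoch >= patience
```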