Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation

Authors: Lincan Cai, Shuang Li, Wenxuan Ma, Jingxuan Kang, Binhui Xie, Zixun Sun, Chengwei Zhu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared with hand-designed, general-purpose, task-specific, and state-of-the-art cross-modal fine-tuning approaches, PaRe demonstrates superior performance across three challenging benchmarks, encompassing more than ten modalities.
Researcher Affiliation | Collaboration | (1) Beijing Institute of Technology; (2) University of Illinois Urbana-Champaign; (3) Interactive Entertainment Group, Tencent.
Pseudocode | Yes | We summarize our PaRe in Alg. 1 in Appendix A.1.
Open Source Code | No | The paper does not contain any explicit statement about releasing code or a link to a code repository.
Open Datasets | Yes | For 2D classification tasks, CIFAR10 (Krizhevsky et al., 2009) and Tiny-ImageNet (Le & Yang, 2015) serve as proxy datasets. For 2D dense prediction tasks, we use VOC (Everingham et al., 2015) as a proxy dataset... For 1D tasks, CoNLL-2003 is employed as a proxy dataset. We validate PaRe for cross-modal fine-tuning on three benchmarks: NASBench-360, PDEBench and OpenML-CC18, comprising a total of 48 datasets.
Dataset Splits | No | The paper mentions training and test sets but does not explicitly mention validation sets or their splits. For example, "The train-test split ratio is 0.5:0.5".
Hardware Specification | Yes | Our experiments are conducted on a single NVIDIA RTX 4090.
Software Dependencies | No | We follow ORCA (Shen et al., 2023) and use the Hugging Face transformers library (Wolf et al., 2019) to implement the pretrained models.
Experiment Setup | Yes | For other experimental settings such as learning rates, number of epochs, optimizers, we adhere to the configurations specified by ORCA. Our experiments are conducted on a single NVIDIA RTX 4090. The specific parameter settings are shown in Table 12 and Table 13.