$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Authors: Chengyue Wu, Teng Wang, Yixiao Ge, Zeyu Lu, Ruisong Zhou, Ying Shan, Ping Luo
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we conduct extensive experiments on vision, language, and vision-language downstream tasks to demonstrate the superiority, generalization, and scalability of our approach. |
| Researcher Affiliation | Collaboration | 1. Department of Computer Science, The University of Hong Kong; 2. ARC Lab, Tencent PCG; 3. Shanghai Jiao Tong University; 4. School of Mathematical Sciences, Fudan University. |
| Pseudocode | No | The paper describes the method and formulas but does not provide pseudocode or an algorithm block. |
| Open Source Code | Yes | The code will be available at https://github.com/TencentARC/pi-Tuning. |
| Open Datasets | Yes | For language tasks, we evaluate the model on 8 tasks from GLUE (Wang et al., 2018). For VL tasks, we experiment on both understanding and generation tasks, including RefCOCO, RefCOCO+ (Yu et al., 2016), RefCOCOg (Mao et al., 2016), VQAv2 (Goyal et al., 2017), SNLI-VE (Xie et al., 2019) and COCO image captioning (Chen et al., 2015). |
| Dataset Splits | Yes | We report the standard metric ACC@0.5 on the validation and test sets. |
| Hardware Specification | Yes | All of the experiments are conducted on NVIDIA A100 40G and V100 32G GPUs. |
| Software Dependencies | No | The paper mentions software components such as the OFA and T5 models, but does not specify version numbers for them or for ancillary software like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Epochs are set to 100, dropout is set to 0.1, warmup rate is set to 0.06, and label smoothing rate is set to 0.1. For prompt tuning, we follow the experimental setting used in Yang et al. (2022), where the batch size is set to 128, the learning rate is set to 0.03, and the prompt length is set to 100. (These values are gathered into a config sketch after this table.) |
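The reported hyperparameters, collected into a minimal Python sketch for reference. The dictionary and key names are hypothetical; the paper quotes only the values above and does not publish a config file.

```python
# Hypothetical config names; only the values are taken from the paper.
finetune_config = {
    "epochs": 100,            # fine-tuning epochs
    "dropout": 0.1,
    "warmup_rate": 0.06,      # learning-rate warmup fraction
    "label_smoothing": 0.1,
}

# Prompt-tuning settings, following Yang et al. (2022) as quoted above.
prompt_tuning_config = {
    "batch_size": 128,
    "learning_rate": 0.03,
    "prompt_length": 100,     # number of prompt tokens
}
```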
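For context on the interpolation named in the title: a minimal sketch, assuming π-Tuning merges parameter-efficient task experts via a softmax-normalized weighted sum of their parameter deltas on top of the frozen base model. The function and argument names are hypothetical, not the authors' API.

```python
import torch

def pi_interpolate(base_params, expert_deltas, logits):
    """Merge task experts' parameter deltas into the base model.

    base_params:   dict[str, Tensor], frozen pretrained weights
    expert_deltas: list[dict[str, Tensor]], one delta dict per task expert
    logits:        1-D Tensor with one learnable scalar per expert; softmax
                   normalizes them so the interpolation weights sum to 1
    """
    weights = torch.softmax(logits, dim=0)
    return {
        name: base + sum(w * deltas[name]
                         for w, deltas in zip(weights, expert_deltas))
        for name, base in base_params.items()
    }
```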