Black-box Prompt Tuning for Vision-Language Model as a Service

Authors: Lang Yu, Qin Chen, Jiaju Lin, Liang He

IJCAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experimental results show that our proposed black-box prompt tuning framework outperforms both hand-crafted prompt engineering and gradient-based prompt learning methods, which serves as evidence of its capability to train task-relevant prompts in a derivative-free manner." |
| Researcher Affiliation | Academia | Lang Yu (1,2), Qin Chen (1,2), Jiaju Lin (1), Liang He (1,2). (1) School of Computer Science and Technology, East China Normal University; (2) Shanghai Institute of AI for Education, East China Normal University. {lyu, jiaju lin}@stu.ecnu.edu.cn, {qchen, lhe}@cs.ecnu.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/BruthYU/BPT-VLM |
| Open Datasets | Yes | To evaluate the effectiveness of BPT-VLM, experiments are conducted on 9 image classification datasets: ImageNet [Deng et al., 2009], Caltech101 [Fei-Fei et al., 2004], Oxford Pets [Parkhi et al., 2012], Flowers102 [Nilsback and Zisserman, 2008], Food101 [Bossard et al., 2014], UCF101 [Soomro et al., 2012], SUN397 [Xiao et al., 2010], EuroSAT [Helber et al., 2019], and DTD [Cimpoi et al., 2014]. |
| Dataset Splits | Yes | Following the few-shot setting of [Zhou et al., 2022], all methods use the same 16-shot split for prompt tuning and are evaluated on the full test sets for comparison. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run the experiments (e.g., GPU model, CPU model, memory). |
| Software Dependencies | No | The paper mentions the open-source libraries pycma and PyPop7 used for the implementation, but does not give version numbers for these or for other key components such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Table 1 (Default Setting of Hyper-parameters) lists: intrinsic dimension 1000, vision prompt length 8, language prompt length 5, population size 30, and cross-entropy as the loss function. |