Black-box Prompt Tuning for Vision-Language Model as a Service
Authors: Lang Yu, Qin Chen, Jiaju Lin, Liang He
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our proposed black-box prompt tuning framework outperforms both hand-crafted prompt engineering and gradient-based prompt learning methods, which serves as evidence of its capability to train task-relevant prompts in a derivative-free manner. |
| Researcher Affiliation | Academia | Lang Yu (1,2), Qin Chen (1,2), Jiaju Lin (1) and Liang He (1,2); (1) School of Computer Science and Technology, East China Normal University; (2) Shanghai Institute of AI for Education, East China Normal University; {lyu, jiaju lin}@stu.ecnu.edu.cn, {qchen, lhe}@cs.ecnu.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/BruthYU/BPT-VLM |
| Open Datasets | Yes | To evaluate the effectiveness of BPT-VLM, we conduct experiments on 9 visual image classification datasets: ImageNet [Deng et al., 2009], Caltech101 [Fei-Fei et al., 2004], Oxford Pets [Parkhi et al., 2012], Flowers102 [Nilsback and Zisserman, 2008], Food101 [Bossard et al., 2014], UCF101 [Soomro et al., 2012], SUN397 [Xiao et al., 2010], EuroSAT [Helber et al., 2019] and DTD [Cimpoi et al., 2014]. |
| Dataset Splits | Yes | Following the few-shot setting adopted in [Zhou et al., 2022], all methods use the same 16-shot split for prompt tuning and are evaluated on full test-sets for comparison. (A few-shot sampling sketch follows the table.) |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "PyCMA" and "PyPop7" as open-source libraries used for implementation, but does not provide specific version numbers for these or other key software components such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Table 1: Default Setting of Hyper-parameters includes Intrinsic Dimension 1000, Vision Prompt Length 8, Language Prompt Length 5, Population Size 30, and Loss Function Cross Entropy. (A derivative-free optimization sketch follows the table.) |
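
The 16-shot protocol quoted in the Dataset Splits row can be summarized with a short sketch. This is an assumed helper, not code from the paper or its repository: `make_few_shot_split`, its arguments, and the `(image_path, label)` sample format are hypothetical.

```python
# Minimal sketch (assumed helper, not from the paper) of building a 16-shot
# training split: sample 16 images per class for prompt tuning and leave the
# full test set untouched for evaluation.
import random
from collections import defaultdict

def make_few_shot_split(samples, shots=16, seed=0):
    """samples: list of (image_path, label) pairs from the full training set."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append((path, label))
    rng = random.Random(seed)
    few_shot = []
    for label, items in by_class.items():
        # Take at most `shots` examples per class.
        few_shot.extend(rng.sample(items, min(shots, len(items))))
    return few_shot
```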
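The hyper-parameters in the Experiment Setup row, together with the PyCMA dependency noted under Software Dependencies, suggest the following minimal sketch of derivative-free prompt tuning. It illustrates the general intrinsic-dimension plus CMA-ES recipe rather than the authors' implementation: `project_to_prompts`, `clip_loss`, the single shared projection matrix, and the embedding width of 512 are all assumptions, and a real objective would run the frozen vision-language model on the 16-shot split and return cross-entropy.

```python
# Minimal sketch (assumed, not the authors' code) of black-box prompt tuning:
# a low-dimensional "intrinsic" vector is optimized with CMA-ES (pycma) and
# projected up to soft vision/language prompt embeddings.
import numpy as np
import cma

INTRINSIC_DIM = 1000          # Table 1: intrinsic dimension
VISION_LEN, LANG_LEN = 8, 5   # Table 1: vision / language prompt lengths
EMBED_DIM = 512               # assumed embedding width of the frozen model
POP_SIZE = 30                 # Table 1: population size

# Fixed random projection from the intrinsic space to the flattened prompt space
# (vision prompts followed by language prompts), as in intrinsic-dimension methods.
rng = np.random.default_rng(0)
proj = rng.normal(size=(INTRINSIC_DIM, (VISION_LEN + LANG_LEN) * EMBED_DIM))

def project_to_prompts(z: np.ndarray):
    """Map an intrinsic vector z to (vision_prompt, language_prompt) embeddings."""
    flat = z @ proj
    vision = flat[: VISION_LEN * EMBED_DIM].reshape(VISION_LEN, EMBED_DIM)
    language = flat[VISION_LEN * EMBED_DIM:].reshape(LANG_LEN, EMBED_DIM)
    return vision, language

def clip_loss(z: np.ndarray) -> float:
    """Hypothetical black-box objective: cross-entropy of the frozen model on the
    16-shot training split, evaluated with the projected soft prompts."""
    vision_prompt, language_prompt = project_to_prompts(z)
    # ... a forward pass through the frozen vision-language model would go here ...
    return float(np.sum(vision_prompt ** 2) + np.sum(language_prompt ** 2))  # dummy stand-in

# CMA-ES loop: ask for a population of candidates, score them, tell the scores back.
es = cma.CMAEvolutionStrategy(INTRINSIC_DIM * [0.0], 0.5, {"popsize": POP_SIZE})
while not es.stop():
    candidates = es.ask()
    es.tell(candidates, [clip_loss(np.asarray(c)) for c in candidates])
    es.disp()
best_prompts = project_to_prompts(np.asarray(es.result.xbest))
```

Population size and intrinsic dimension trade off query cost against expressiveness: each generation costs `POP_SIZE` forward passes of the frozen model, while the projection keeps the search space at 1000 dimensions instead of the full prompt embedding size.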