AutoVP: An Automated Visual Prompting Framework and Benchmark

Authors: Hsi-Ai Tsao, Lei Hsiung, Pin-Yu Chen, Sijia Liu, Tsung-Yi Ho

ICLR 2024

The reproducibility variables, the result recorded for each, and the supporting response extracted from the paper are listed below.
Research Type (Experimental): "Our extensive experimental results show that AutoVP outperforms the best-known current VP methods by a substantial margin, achieving up to a 6.7% improvement in accuracy and a maximum performance increase of 27.5% over the linear-probing (LP) baseline."
Researcher Affiliation (Collaboration): Hsi-Ai Tsao1*, Lei Hsiung2*, Pin-Yu Chen3, Sijia Liu4, Tsung-Yi Ho5. 1 National Tsing Hua University, 2 Dartmouth College, 3 IBM Research, 4 Michigan State University, 5 The Chinese University of Hong Kong.
Pseudocode (Yes): "Algorithm 1: FreqMap(δ, f_θs, D_t, m). Input: visual prompts δ, source classifier f_θs, target dataset D_t, and the specified number of source classes mapped to each target class, m. Output: mapping matrix M."
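To make the frequency-based mapping concrete, here is a minimal PyTorch sketch. The function name `freq_map`, the greedy no-reuse assignment, and the tensor shapes are assumptions for illustration, not the authors' implementation.

```python
import torch

def freq_map(source_logits, target_labels, num_source, num_target, m):
    """Hypothetical sketch of frequency-based label mapping (FreqMap).

    source_logits: (N, num_source) source-classifier outputs on the
                   prompted target images.
    target_labels: (N,) ground-truth target labels.
    Returns a (num_source, num_target) 0/1 mapping matrix M assigning
    each target class its m most frequently predicted source classes.
    """
    preds = source_logits.argmax(dim=1)            # top-1 source class per image
    counts = torch.zeros(num_source, num_target)
    for s, t in zip(preds.tolist(), target_labels.tolist()):
        counts[s, t] += 1                          # frequency table

    M = torch.zeros(num_source, num_target)
    assigned = set()                               # assumption: no source class is reused
    for t in range(num_target):
        order = torch.argsort(counts[:, t], descending=True)
        picked = [s.item() for s in order if s.item() not in assigned][:m]
        for s in picked:
            M[s, t] = 1.0
            assigned.add(s)
    return M
```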
Open Source Code (Yes): "The source code is available at https://github.com/IBM/AutoVP."
Open Datasets (Yes): "We evaluated the performance of AutoVP on 12 downstream datasets (CIFAR10, CIFAR100, ISIC, SVHN, GTSRB, Flowers102, DTD, Food101, EuroSAT, OxfordIIITPet, UCF101, and FMoW), which are common datasets when measuring transfer learning generalization. Detailed descriptions of these datasets are given in Appendix B.1."
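Most of these benchmarks ship with torchvision, so a minimal, hypothetical loading sketch looks like the following; the root path and transform are placeholders, and ISIC, UCF101, and FMoW need their own pipelines.

```python
from torchvision import datasets, transforms

# Placeholder transform; AutoVP's own resizing is handled by its
# input-scaling module rather than a fixed Resize.
tfm = transforms.Compose([transforms.Resize((128, 128)),
                          transforms.ToTensor()])

# GTSRB, Flowers102, DTD, Food101, EuroSAT, and OxfordIIITPet require
# torchvision >= 0.12.
cifar10 = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
svhn = datasets.SVHN("data", split="train", download=True, transform=tfm)
gtsrb = datasets.GTSRB("data", split="train", download=True, transform=tfm)
```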
Dataset Splits (Yes): "To speed up the tuning operation and save computational resources, we use Ray Tune (Liaw et al., 2018) along with an early-stop strategy for terminating poor trials."
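To illustrate the early-stopping setup, here is a minimal Ray Tune sketch using the ASHA scheduler mentioned under Software Dependencies. The objective function, search space, and trial count are illustrative assumptions, and the `tune.report` keyword API matches older Ray releases.

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler

def train_autovp(config):
    """Hypothetical objective: stand-in for training a visual prompt
    under one hyperparameter configuration."""
    acc = 0.0
    for epoch in range(10):
        acc += config["lr"] * 0.01        # placeholder for a real training epoch
        tune.report(accuracy=acc)         # lets ASHA terminate poor trials early

scheduler = ASHAScheduler(metric="accuracy", mode="max", grace_period=1)
tune.run(
    train_autovp,
    config={"lr": tune.loguniform(1e-3, 1e2),
            "image_size": tune.choice([32, 64, 128, 192])},
    scheduler=scheduler,
    num_samples=20,
)
```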
Hardware Specification (Yes): "Our experiments were performed on NVIDIA GeForce RTX 3090 and are implemented with PyTorch."
Software Dependencies (No): "Our experiments were performed on NVIDIA GeForce RTX 3090 and are implemented with PyTorch. The input scaling module... is implemented using kornia.geometry.transform() from the Kornia library (Riba et al., 2020). ...we use Ray Tune (Liaw et al., 2018) along with an early-stop strategy... An ASHA scheduler (Li et al., 2018) was used..."
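The input-scaling step can be sketched with Kornia's resize. The helper name `scale_and_pad`, the canvas size, and the zero padding standing in for the trainable prompt frame are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
import kornia.geometry.transform as KT

def scale_and_pad(images, scaled_size, canvas_size=224):
    """Resize a batch (B, C, H, W) to a tunable size, then zero-pad up to
    the model's input resolution; the zero border marks where trainable
    prompt pixels would be placed. All names here are illustrative."""
    resized = KT.resize(images, (scaled_size, scaled_size),
                        interpolation="bilinear")
    pad_l = (canvas_size - scaled_size) // 2
    pad_r = canvas_size - scaled_size - pad_l   # handle odd remainders
    return F.pad(resized, (pad_l, pad_r, pad_l, pad_r))
```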
Experiment Setup (Yes): "We repeated each AutoVP experiment in triplicate, utilizing a learning rate of 40 with the SGD optimizer for CLIP, and a learning rate of 0.001 with the Adam optimizer for the other pre-trained models."
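A small sketch of those optimizer choices; the helper name and the backbone switch are hypothetical, while the learning rates and optimizer types follow the quoted setup.

```python
import torch

def build_optimizer(prompt_params, backbone_name):
    """Sketch of the reported optimizer choices; `build_optimizer` and the
    `backbone_name` switch are assumptions for illustration."""
    if backbone_name == "CLIP":
        # high-learning-rate SGD, as reported for the CLIP backbone
        return torch.optim.SGD(prompt_params, lr=40.0)
    # Adam with lr = 0.001 for the other pre-trained models
    return torch.optim.Adam(prompt_params, lr=1e-3)
```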