AutoVP: An Automated Visual Prompting Framework and Benchmark
Authors: Hsi-Ai Tsao, Lei Hsiung, Pin-Yu Chen, Sijia Liu, Tsung-Yi Ho
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experimental results show that AutoVP outperforms the best-known current VP methods by a substantial margin, having up to 6.7% improvement in accuracy; and attains a maximum performance increase of 27.5% compared to linear-probing (LP) baseline. |
| Researcher Affiliation | Collaboration | Hsi-Ai Tsao1*, Lei Hsiung2*, Pin-Yu Chen3, Sijia Liu4, Tsung-Yi Ho5 1 National Tsing Hua University, 2 Dartmouth College, 3 IBM Research 4 Michigan State University, 5 The Chinese University of Hong Kong |
| Pseudocode | Yes | Algorithm 1: FreqMap(δ, fθs, Dt, m). Input: visual prompts δ, source classifier fθs, target dataset Dt, and the specified number of source classes mapped to each target class m. Output: mapping matrix M. |
| Open Source Code | Yes | The source code is available at https://github.com/IBM/AutoVP. |
| Open Datasets | Yes | We evaluated the performance of AutoVP on 12 downstream datasets (CIFAR10, CIFAR100, ISIC, SVHN, GTSRB, Flowers102, DTD, Food101, EuroSAT, Oxford-IIIT Pet, UCF101, and FMoW), which are common datasets when measuring transfer learning generalization. Detailed descriptions of these datasets are given in Appendix B.1. |
| Dataset Splits | Yes | To speed up the tuning operation and save computational resources, we use Ray Tune (Liaw et al., 2018) along with an early-stop strategy for terminating poor trials. |
| Hardware Specification | Yes | Our experiments were performed on NVIDIA GeForce RTX 3090 and are implemented with PyTorch. |
| Software Dependencies | No | Our experiments were performed on NVIDIA GeForce RTX 3090 and are implemented with PyTorch. The input scaling module... is implemented using kornia.geometry.transform() from the Kornia library (Riba et al., 2020). ...we use Ray Tune (Liaw et al., 2018) along with an early-stop strategy... An ASHA scheduler (Li et al., 2018) was used... |
| Experiment Setup | Yes | We repeated each AutoVP experiment in triplicate, utilizing a learning rate of 40 with the SGD optimizer for CLIP, and a learning rate of 0.001 with the Adam optimizer for the other pre-trained models. |
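The pseudocode row above names Algorithm 1, FreqMap, which builds a mapping matrix M by assigning to each target class the m source classes most frequently predicted by the frozen source classifier on that class's prompted images. The paper's exact implementation is in the linked repository; the following is a minimal NumPy sketch of that frequency-counting idea, with all function and variable names chosen here for illustration:

```python
import numpy as np

def freq_map(source_logits, target_labels, num_target_classes, m):
    """Frequency-based label mapping (sketch of Algorithm 1, FreqMap).

    source_logits: (N, S) array of source-classifier logits on the
        prompted target images; target_labels: (N,) ground-truth target
        classes. Returns a (T, S) binary mapping matrix M where M[t, s] = 1
        iff source class s is among the m classes most often predicted
        for target class t.
    """
    source_preds = source_logits.argmax(axis=1)      # predicted source class per image
    num_source_classes = source_logits.shape[1]
    M = np.zeros((num_target_classes, num_source_classes))
    for t in range(num_target_classes):
        # Count how often each source class is predicted on class-t images.
        counts = np.bincount(source_preds[target_labels == t],
                             minlength=num_source_classes)
        # Select the m most frequent source classes for this target class.
        top_m = np.argsort(counts)[::-1][:m]
        M[t, top_m] = 1.0
    return M
```

With m = 1 this reduces to the common one-to-one frequency mapping; larger m lets several source logits be aggregated per target class.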
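The experiment-setup row states two optimizer regimes: SGD with learning rate 40 for CLIP, and Adam with learning rate 0.001 for the other pre-trained backbones. A tiny helper capturing that branching (the function name and dict format are ours, not from the paper) could look like:

```python
def make_optimizer_config(backbone):
    """Return the optimizer choice the paper reports for a given backbone.

    Per the quoted setup: SGD with LR 40 for CLIP; Adam with LR 1e-3
    for the other pre-trained models.
    """
    if backbone == "CLIP":
        return {"optimizer": "SGD", "lr": 40.0}
    return {"optimizer": "Adam", "lr": 1e-3}
```

The unusually large CLIP learning rate applies only to the lightweight prompt/output-mapping parameters, since the pre-trained backbone itself stays frozen.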