POUF: Prompt-Oriented Unsupervised Fine-tuning for Large Pre-trained Models

Authors: Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, Pengcheng He, Mingyuan Zhou

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify our approach's applicability, we conduct extensive experiments on image classification, sentiment analysis, and natural language inference tasks. Across 13 image-related tasks and 15 language-related ones, the proposed approach achieves consistent improvements over the baselines.
Researcher Affiliation | Collaboration | ¹The University of Texas at Austin, ²Microsoft Azure AI.
Pseudocode | Yes | Algorithm 1: POUF pseudocode for language-augmented vision models, PyTorch-like.
Open Source Code | Yes | PyTorch code is available at https://github.com/korawat-tanwisuth/POUF.
Open Datasets | Yes | Office-31 (Saenko et al., 2010) contains 4,652 images with 31 classes from three domains: Amazon (A), Webcam (W), and DSLR (D). The language tasks use the GLUE benchmark (Wang et al., 2018).
Dataset Splits | Yes | Specifically, for each task, the data is split into D_train, D_dev, and D_test. The authors tune the hyper-parameters on D_dev and report the performance of the model on D_test. We validate the model's performance every 100 steps on D_dev and take the best validated checkpoint for the final evaluation on D_test.
Hardware Specification | Yes | All experiments are conducted using a single Nvidia Tesla V100 GPU.
Software Dependencies | No | The paper mentions using PyTorch and libraries such as CLIP and TLlib but does not specify their version numbers.
Experiment Setup | Yes | The learning rate schedule is set to η_iter = η_0 · (1 + γ · iter)^(-α), where η_0 is the initial learning rate. We adopt the following default hyper-parameters: γ = 2e-4 and α = 0.75. We set η_0 = 5e-7 for all experiments except for prompt tuning on Office-31, where η_0 = 1e-3. We use mini-batch SGD with a momentum of 0.9 and a batch size of 96 for Office-31 and Office-Home and 16 for DomainNet. The weight of the mutual-information objective, λ, is set to 0.3 for all experiments.
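
For the Dataset Splits row: a minimal sketch of the described protocol (validate on D_dev every 100 steps, keep the best checkpoint for the final D_test evaluation). The 100-step interval comes from the report; `train_step`, `evaluate`, and the loop structure are illustrative assumptions, not the authors' code.

```python
import copy

EVAL_EVERY = 100  # validate on D_dev every 100 steps, as reported

def select_best_checkpoint(model, train_step, evaluate, d_train, d_dev, num_steps):
    """Keep the checkpoint with the best D_dev score for the final D_test evaluation.

    `train_step(model, d_train)` and `evaluate(model, d_dev)` are hypothetical
    helpers standing in for one optimization step and dev-set scoring.
    """
    best_score = float("-inf")
    best_state = copy.deepcopy(model.state_dict())
    for step in range(1, num_steps + 1):
        train_step(model, d_train)
        if step % EVAL_EVERY == 0:
            score = evaluate(model, d_dev)
            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)  # restore the best checkpoint before testing
    return model
```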
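For the Experiment Setup row: the schedule η_iter = η_0 · (1 + γ · iter)^(-α) can be expressed with PyTorch's LambdaLR. This is a hedged sketch using the reported defaults (η_0 = 5e-7, γ = 2e-4, α = 0.75, momentum 0.9); the dummy parameter and bare training loop are placeholders, not the authors' setup.

```python
import torch

# Reported defaults; eta_0 = 1e-3 instead for prompt tuning on Office-31.
eta_0, gamma, alpha = 5e-7, 2e-4, 0.75

# Placeholder parameter standing in for the prompt / model weights being tuned.
params = [torch.nn.Parameter(torch.zeros(16, 512))]
optimizer = torch.optim.SGD(params, lr=eta_0, momentum=0.9)

# eta_iter = eta_0 * (1 + gamma * iter)^(-alpha); LambdaLR multiplies eta_0 by the
# returned factor, so the lambda returns (1 + gamma * iter)^(-alpha).
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1.0 + gamma * it) ** (-alpha)
)

for it in range(1000):   # training loop placeholder: compute loss and backward here,
    optimizer.step()     # then step the optimizer and the schedule once per iteration
    scheduler.step()
```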
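The same row reports a mutual-information objective with weight λ = 0.3, but its exact form is not reproduced in the table. The sketch below assumes the common InfoMax estimate for unlabeled predictions, I(x; y) ≈ H(E[p(y|x)]) − E[H(p(y|x))], which may differ in detail from the paper's formulation; `logits` and `alignment_loss` are hypothetical names.

```python
import torch

def mutual_information_loss(logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative of the InfoMax mutual-information estimate (to be minimized).

    I(x; y) ~ H(mean_x p(y|x)) - mean_x H(p(y|x)), computed over a batch of
    target-domain logits of shape (batch, num_classes).
    """
    probs = logits.softmax(dim=-1)
    cond_ent = -(probs * (probs + eps).log()).sum(dim=-1).mean()   # E[H(p(y|x))]
    marginal = probs.mean(dim=0)                                   # E[p(y|x)]
    marg_ent = -(marginal * (marginal + eps).log()).sum()          # H(E[p(y|x)])
    return cond_ent - marg_ent                                     # = -I(x; y)

# Usage with the reported weight (alignment_loss stands in for the other objective):
#   total_loss = alignment_loss + 0.3 * mutual_information_loss(logits)
```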