Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning
Authors: Zhuoyan Xu, Zhenmei Shi, Junyi Wei, Fangzhou Mu, Yin Li, Yingyu Liang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We substantiate our theoretical claims with extensive empirical evidence. Further, we present results affirming that our task selection algorithm adeptly chooses related finetuning tasks, improving model performance on target tasks. We believe our study sheds new light on the effective adaptation of foundation models to new tasks that lack abundant labels. Our code is available at https://github.com/OliverXUZY/Foudation-Model_Multitask. |
| Researcher Affiliation | Academia | Zhuoyan Xu, Zhenmei Shi, Junyi Wei, Fangzhou Mu, Yin Li, Yingyu Liang; University of Wisconsin-Madison; {zhuoyan.xu,jwei53,fmu2,yin.li}@wisc.edu, {zhmeishi,yliang}@cs.wisc.edu |
| Pseudocode | Yes | Algorithm 1: Consistency-Diversity Task Selection. Algorithm 2: Multitask Finetuning. (A hypothetical sketch of the selection idea appears after this table.) |
| Open Source Code | Yes | Our code is available at https://github.com/OliverXUZY/Foudation-Model_Multitask. |
| Open Datasets | Yes | We use four few-shot learning benchmarks: miniImageNet (Vinyals et al., 2016), tieredImageNet (Ren et al., 2018), DomainNet (Peng et al., 2019) and Meta-Dataset (Triantafillou et al., 2020). |
| Dataset Splits | Yes | Tasks used for finetuning are constructed from samples in the training split. Each task is formed by randomly sampling 15 classes, with every class contributing 1 or 5 support samples and 10 query samples. Target tasks are similarly constructed from the test set. (A sketch of this sampling procedure appears after the table.) |
| Hardware Specification | No | The paper mentions models like ViT-B and ResNet50 and pretraining methods, but does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'SGD optimizer with momentum 0.9' and concepts like 'cosine similarity', but it does not specify any software libraries or frameworks (e.g., PyTorch, TensorFlow, scikit-learn) with their version numbers. |
| Experiment Setup | Yes | We consider few-shot tasks consisting of N classes with K support samples and Q query samples per class (known as N-way K-shot). The goal is to classify the query samples based on the support samples. Tasks used for finetuning are constructed from samples in the training split. Each task is formed by randomly sampling 15 classes, with every class contributing 1 or 5 support samples and 10 query samples. Target tasks are similarly constructed from the test set. During multitask finetuning, we update all parameters in the model using a nearest centroid classifier... For optimization, we use the SGD optimizer with momentum 0.9; the learning rate is 1e-5 for CLIP and MoCo v3 pretrained models, and 2e-6 for DINOv2 pretrained models. The models were finetuned over varying numbers of epochs in each scenario until they reached convergence. (A hedged sketch of the nearest centroid classifier appears after the table.) |
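
The report only names Algorithm 1 (Consistency-Diversity Task Selection) without reproducing its steps. As a hypothetical illustration of what a consistency-diversity trade-off could look like, the sketch below greedily scores candidate finetuning tasks by cosine similarity to the target task (consistency) minus their maximum similarity to tasks already selected (diversity). Cosine similarity is the only primitive the excerpts mention; the centroid representation, the weight `lam`, and the greedy loop are all assumptions for illustration, not the authors' Algorithm 1.

```python
import torch
import torch.nn.functional as F

def select_tasks(task_centroids, target_centroid, budget, lam=0.5):
    """Hypothetical greedy consistency-diversity task selection.

    task_centroids: (T, D) mean embedding of each candidate finetuning task
    target_centroid: (D,) mean embedding of the target task
    Returns the indices of the `budget` selected tasks.
    """
    # Consistency: cosine similarity of every candidate to the target task.
    consistency = F.cosine_similarity(
        task_centroids, target_centroid.unsqueeze(0), dim=-1
    )
    selected, remaining = [], list(range(task_centroids.size(0)))
    while remaining and len(selected) < budget:
        scores = []
        for i in remaining:
            if selected:
                # Redundancy: highest similarity to an already-selected task.
                redundancy = F.cosine_similarity(
                    task_centroids[i : i + 1], task_centroids[selected], dim=-1
                ).max().item()
            else:
                redundancy = 0.0
            scores.append(consistency[i].item() - lam * redundancy)
        best = remaining[max(range(len(remaining)), key=scores.__getitem__)]
        selected.append(best)
        remaining.remove(best)
    return selected
```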
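
For concreteness, here is a minimal sketch of the N-way K-shot task construction described in the Dataset Splits and Experiment Setup rows (15-way, 1- or 5-shot, 10 queries per class). The function name and the plain-Python sampling are illustrative assumptions; the paper's actual data pipeline is not specified.

```python
import random
from collections import defaultdict

def sample_task(labels, n_way=15, k_shot=5, q_query=10, rng=None):
    """Sample one N-way K-shot few-shot task.

    labels: per-example class labels for the split being sampled from
    Returns (support, query) lists of dataset indices.
    """
    rng = rng or random.Random()
    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Randomly pick n_way classes, then k_shot + q_query examples per class.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], k_shot + q_query)
        support.extend(chosen[:k_shot])
        query.extend(chosen[k_shot:])
    return support, query
```

For example, `sample_task(train_labels, k_shot=1)` yields a 15-way 1-shot task matching one of the paper's settings.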
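
The Experiment Setup row mentions finetuning with a nearest centroid classifier, but the report does not reproduce its implementation. Below is a minimal sketch of one standard formulation, assuming PyTorch (the paper's framework is not stated; see the Software Dependencies row): query embeddings are scored against per-class support centroids via cosine similarity, which the excerpts do mention.

```python
import torch
import torch.nn.functional as F

def nearest_centroid_logits(support_emb, support_labels, query_emb, n_way):
    """Classify query embeddings by cosine similarity to class centroids.

    support_emb: (N*K, D) embeddings of the support samples
    support_labels: (N*K,) integer labels in [0, n_way)
    query_emb: (N*Q, D) embeddings of the query samples
    Returns (N*Q, n_way) logits usable with a cross-entropy loss.
    """
    # Average the support embeddings of each class to form its centroid.
    centroids = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_way)
    ])
    # Cosine similarity between every query and every centroid.
    return F.cosine_similarity(
        query_emb.unsqueeze(1), centroids.unsqueeze(0), dim=-1
    )
```

In practice such cosine logits are often divided by a temperature before the cross-entropy loss; that scaling is another assumption not confirmed by the excerpts.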