Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning
Authors: Zhuoyan Xu, Zhenmei Shi, Junyi Wei, Fangzhou Mu, Yin Li, Yingyu Liang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We substantiate our theoretical claims with extensive empirical evidence. Further, we present results affirming that our task selection algorithm adeptly chooses related finetuning tasks, improving model performance on target tasks. We believe our study sheds new light on the effective adaptation of foundation models to new tasks that lack abundant labels. Our code is available at https://github.com/OliverXUZY/Foudation-Model_Multitask. |
| Researcher Affiliation | Academia | Zhuoyan Xu, Zhenmei Shi, Junyi Wei, Fangzhou Mu, Yin Li, Yingyu Liang; University of Wisconsin-Madison; {zhuoyan.xu,jwei53,fmu2,yin.li}@wisc.edu, {zhmeishi,yliang}@cs.wisc.edu |
| Pseudocode | Yes | Algorithm 1: Consistency-Diversity Task Selection. Algorithm 2: Multitask Finetuning. (A hypothetical sketch of the selection idea appears after this table.) |
| Open Source Code | Yes | Our code is available at https://github.com/OliverXUZY/Foudation-Model_Multitask. |
| Open Datasets | Yes | We use four few-shot learning benchmarks: miniImageNet (Vinyals et al., 2016), tieredImageNet (Ren et al., 2018), DomainNet (Peng et al., 2019) and Meta-Dataset (Triantafillou et al., 2020). |
| Dataset Splits | Yes | Tasks used for finetuning are constructed from samples in the training split. Each task is formed by randomly sampling 15 classes, with every class contributing 1 or 5 support samples and 10 query samples. Target tasks are similarly constructed from the test set. (A sketch of this sampling procedure appears after the table.) |
| Hardware Specification | No | The paper mentions models like ViT-B and ResNet50 and pretraining methods, but does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'SGD optimizer with momentum 0.9' and concepts like 'cosine similarity', but it does not specify any software libraries or frameworks (e.g., PyTorch, TensorFlow, scikit-learn) with their version numbers. |
| Experiment Setup | Yes | We consider few-shot tasks consisting of N classes with K support samples and Q query samples per class (known as N-way K-shot). The goal is to classify the query samples based on the support samples. Tasks used for finetuning are constructed from samples in the training split. Each task is formed by randomly sampling 15 classes, with every class contributing 1 or 5 support samples and 10 query samples. Target tasks are similarly constructed from the test set. During multitask finetuning, we update all parameters in the model using a nearest centroid classifier... For optimization, we use the SGD optimizer with momentum 0.9; the learning rate is 1e-5 for CLIP and MoCo v3 pretrained models, and 2e-6 for DINOv2 pretrained models. The models were finetuned over varying numbers of epochs in each scenario until they reached convergence. (A hedged sketch of the nearest centroid classifier appears after the table.) |
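
The report only names Algorithm 1 (Consistency-Diversity Task Selection) without reproducing its steps. As a hypothetical illustration of what a consistency-diversity trade-off could look like, the sketch below greedily scores candidate finetuning tasks by cosine similarity to the target task (consistency) minus their maximum similarity to tasks already selected (diversity). Cosine similarity is the only primitive the excerpts mention; the centroid representation, the weight `lam`, and the greedy loop are all assumptions for illustration, not the authors' Algorithm 1.

```python
import torch
import torch.nn.functional as F

def select_tasks(task_centroids, target_centroid, budget, lam=0.5):
    """Hypothetical greedy consistency-diversity task selection.

    task_centroids: (T, D) mean embedding of each candidate finetuning task
    target_centroid: (D,) mean embedding of the target task
    Returns the indices of the `budget` selected tasks.
    """
    # Consistency: cosine similarity of every candidate to the target task.
    consistency = F.cosine_similarity(
        task_centroids, target_centroid.unsqueeze(0), dim=-1
    )
    selected, remaining = [], list(range(task_centroids.size(0)))
    while remaining and len(selected) < budget:
        scores = []
        for i in remaining:
            if selected:
                # Redundancy: highest similarity to an already-selected task.
                redundancy = F.cosine_similarity(
                    task_centroids[i : i + 1], task_centroids[selected], dim=-1
                ).max().item()
            else:
                redundancy = 0.0
            scores.append(consistency[i].item() - lam * redundancy)
        best = remaining[max(range(len(remaining)), key=scores.__getitem__)]
        selected.append(best)
        remaining.remove(best)
    return selected
```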
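
For concreteness, here is a minimal sketch of the N-way K-shot task construction described in the Dataset Splits and Experiment Setup rows (15-way, 1- or 5-shot, 10 queries per class). The function name and the plain-Python sampling are illustrative assumptions; the paper's actual data pipeline is not specified.

```python
import random
from collections import defaultdict

def sample_task(labels, n_way=15, k_shot=5, q_query=10, rng=None):
    """Sample one N-way K-shot few-shot task.

    labels: per-example class labels for the split being sampled from
    Returns (support, query) lists of dataset indices.
    """
    rng = rng or random.Random()
    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Randomly pick n_way classes, then k_shot + q_query examples per class.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], k_shot + q_query)
        support.extend(chosen[:k_shot])
        query.extend(chosen[k_shot:])
    return support, query
```

For example, `sample_task(train_labels, k_shot=1)` yields a 15-way 1-shot task matching one of the paper's settings.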
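
The Experiment Setup row mentions finetuning with a nearest centroid classifier, but the report does not reproduce its implementation. Below is a minimal sketch of one standard formulation, assuming PyTorch (the paper's framework is not stated; see the Software Dependencies row): query embeddings are scored against per-class support centroids via cosine similarity, which the excerpts do mention.

```python
import torch
import torch.nn.functional as F

def nearest_centroid_logits(support_emb, support_labels, query_emb, n_way):
    """Classify query embeddings by cosine similarity to class centroids.

    support_emb: (N*K, D) embeddings of the support samples
    support_labels: (N*K,) integer labels in [0, n_way)
    query_emb: (N*Q, D) embeddings of the query samples
    Returns (N*Q, n_way) logits usable with a cross-entropy loss.
    """
    # Average the support embeddings of each class to form its centroid.
    centroids = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_way)
    ])
    # Cosine similarity between every query and every centroid.
    return F.cosine_similarity(
        query_emb.unsqueeze(1), centroids.unsqueeze(0), dim=-1
    )
```

In practice such cosine logits are often divided by a temperature before the cross-entropy loss; that scaling is another assumption not confirmed by the excerpts.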