Parameter-Efficient Model Adaptation for Vision Transformers
Authors: Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, Xin Eric Wang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an empirical study of each efficient model adaptation method, focusing on its performance and parameter cost. Furthermore, we propose a parameter-efficient model adaptation framework, which first selects submodules by measuring local intrinsic dimensions and then projects them into a subspace for further decomposition via a novel Kronecker Adaptation (KAdaptation) method. We experiment on 20 datasets under the few-shot setting and 7 image classification datasets under the full-shot setting. (See the Kronecker-update and intrinsic-dimension sketches after this table.) |
| Researcher Affiliation | Collaboration | Xuehai He¹, Chunyuan Li², Pengchuan Zhang², Jianwei Yang², Xin Eric Wang¹ — ¹UC Santa Cruz, ²Microsoft Research at Redmond |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks are present in the paper. |
| Open Source Code | Yes | To facilitate future research, implementations of all the methods studied in this work are released at https://github.com/eric-ai-lab/PEViT. |
| Open Datasets | Yes | For few-shot benchmark experiments, we conduct experiments on 20 image classification datasets from the ELEVATER benchmark (Li et al. 2022b)... For full-shot experiments, we summarize the results by computing the average performance on CIFAR10 (Krizhevsky and Hinton 2009), CIFAR100 (Krizhevsky and Hinton 2009), SUN397 (Xiao et al. 2010), DTD (Cimpoi et al. 2014), STL10 (Coates, Ng, and Lee 2011), FGVCAircraft (Maji et al. 2013), and FER2013 (Goodfellow et al. 2013). |
| Dataset Splits | Yes | We use the official split for each of these datasets. |
| Hardware Specification | Yes | For few-shot benchmark experiments, we conduct experiments on 20 image classification datasets from the ELEVATER benchmark (Li et al. 2022b) on four Quadro RTX A6000 GPUs. |
| Software Dependencies | No | The paper mentions optimizers like SGD and AdamW and notes automatic hyper-parameter tuning, but it does not specify versions for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For benchmark experiments, we use the SGD (Ruder 2016) optimizer, with the learning rate and weight decay automatically searched for all methods so that these two hyperparameters reach their optimum combination. Training epochs are set via grid search. For intrinsic dimension experiments, we use AdamW (Kingma and Ba 2014) as the optimizer, with a weight decay of 10⁻⁸, a learning rate of 10⁻⁵, and a batch size of 32, following the setting in Li et al. (2018). (See the optimizer-setup sketch after this table.) |
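To make the KAdaptation row concrete, below is a minimal PyTorch sketch of a Kronecker-product weight update in the spirit of the paper's ΔW = Σᵢ Aᵢ ⊗ Bᵢ, with each Bᵢ further decomposed as a low-rank product uᵢvᵢᵀ. The class names, factor shapes, and initialization here are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KroneckerAdapter(nn.Module):
    """Sketch of a Kronecker-style update: delta_W = sum_i A_i kron (u_i v_i^T).

    Only the small factors A_i, u_i, v_i are trained; the pretrained weight
    stays frozen. All shapes are illustrative, not the paper's exact settings.
    """

    def __init__(self, d_out, d_in, n_factors=4, a_dim=4, rank=1):
        super().__init__()
        assert d_out % a_dim == 0 and d_in % a_dim == 0
        b_out, b_in = d_out // a_dim, d_in // a_dim
        self.A = nn.Parameter(torch.randn(n_factors, a_dim, a_dim) * 0.01)
        self.u = nn.Parameter(torch.randn(n_factors, b_out, rank) * 0.01)
        # v starts at zero so the adapted model initially equals the pretrained one.
        self.v = nn.Parameter(torch.zeros(n_factors, rank, b_in))

    def delta_w(self):
        B = self.u @ self.v  # each B_i = u_i v_i^T, shape (b_out, b_in)
        return sum(torch.kron(a, b) for a, b in zip(self.A, B))

class AdaptedLinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable Kronecker update."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)  # keep the pretrained weight fixed
        self.adapter = KroneckerAdapter(linear.out_features, linear.in_features)

    def forward(self, x):
        return F.linear(x, self.linear.weight + self.adapter.delta_w(),
                        self.linear.bias)

layer = AdaptedLinear(nn.Linear(768, 768))  # e.g., a ViT attention projection
out = layer(torch.randn(2, 768))
```

Freezing the base weight and training only the Kronecker factors is what keeps the parameter count low: with these illustrative shapes, the adapter adds roughly n_factors × (a_dim² + b_out + b_in) parameters per layer instead of d_out × d_in.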
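The "local intrinsic dimension" measurement cited in the Research Type and Experiment Setup rows follows Li et al. (2018): freeze all native parameters at θ₀ and train only a d-dimensional vector z through a fixed random projection, θ = θ₀ + Pz. A minimal sketch, assuming torch.func.functional_call (PyTorch ≥ 2.0); the class and variable names are mine, not the paper's.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

class SubspaceModel(nn.Module):
    """Train a frozen module in a random d-dimensional subspace:
    theta = theta_0 + P z (Li et al. 2018). Only z is trainable."""

    def __init__(self, module: nn.Module, d: int):
        super().__init__()
        self.module = module
        self.names, self.theta0, self.projs = [], [], []
        for name, p in module.named_parameters():
            p.requires_grad_(False)
            self.names.append(name)
            self.theta0.append(p.detach().clone())
            # Fixed (non-trainable) random projection, one block per tensor.
            self.projs.append(torch.randn(p.numel(), d) / p.numel() ** 0.5)
        self.z = nn.Parameter(torch.zeros(d))

    def forward(self, x):
        # Rebuild every parameter as theta_0 + P z, then call the module
        # functionally so gradients flow back to z alone.
        params = {
            name: t0 + (proj @ self.z).view_as(t0)
            for name, t0, proj in zip(self.names, self.theta0, self.projs)
        }
        return functional_call(self.module, params, (x,))

net = SubspaceModel(nn.Linear(16, 4), d=10)
net(torch.randn(8, 16)).sum().backward()  # gradients reach only net.z
```

The intrinsic dimension is then read off as the smallest d at which this subspace training reaches a set fraction (90% in Li et al. 2018) of full fine-tuning accuracy.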
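Finally, the Experiment Setup row translates directly into optimizer configurations. A sketch assuming PyTorch; the candidate grid values for the SGD search are placeholders, since the paper states only that learning rate and weight decay are searched automatically.

```python
import itertools
import torch
import torch.nn as nn

model = nn.Linear(768, 10)  # stand-in for the adapted classifier

# Benchmark runs: SGD with (lr, weight_decay) chosen by automatic search.
# These candidate grids are assumptions; the paper does not list them.
for lr, wd in itertools.product([1e-3, 1e-2, 1e-1], [0.0, 1e-4, 1e-2]):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=wd)
    # ... train, validate, and keep the best-performing (lr, wd) pair

# Intrinsic-dimension runs: fixed AdamW settings quoted from the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=1e-8)
batch_size = 32
```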