Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

Authors: Qiong Wu, Wei Yu, Yiyi Zhou, Shubin Huang, Xiaoshuai Sun, Rongrong Ji

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate DAS, we apply it to a bunch of representative VLP models, and conduct extensive experiments on a set of VL tasks. The experimental results not only show the great advantages of DAS in reducing computational complexity, e.g. 11.97% FLOPs of METER on VQA2.0, but also confirm its competitiveness against existing PETL methods in terms of parameter scale and performance.
Researcher Affiliation | Academia | 1 Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China. 2 Institute of Artificial Intelligence, Xiamen University, 361005, P.R. China.
Pseudocode | Yes | Algorithm 1 Dynamic Architecture Skipping
Open Source Code | Yes | Our source code is given in https://github.com/DoubtedSteam/DAS.
Open Datasets | Yes | To validate DAS, we apply it to a set of VLP models, namely including METER [10], ViLT [28] and LaVIN [42], on three VL benchmarks, namely VQA2.0 [14], NLVR2 [57] and Flickr30K [51].
Dataset Splits | Yes | We conduct experiments on VQA2.0 [14]. Instead of answering the question in open-ended natural language, it is converted into a classification task with 3,129 classes. Following the previous setting [10, 28], the PETL methods and DAS are trained on the train and validation sets of VQA2.0, and we report the test-dev results from the online evaluation. Notably, the validation set is used during training for all methods.
Hardware Specification | Yes | We conduct all experiments with a single NVIDIA Tesla A100 GPU and the settings not mentioned are the same as ViLT [28] and METER [10].
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | Following the most conventional setting [17, 59], the width of hidden states in adapters is set to 96. And the hidden dimension of the adapter used for the skip connection is set to 192 to retain a certain capacity. The VLP model is first warmed up for one epoch. In this epoch, the subnetwork is randomly sampled according to the skipped number m. Then the search runs 2 epochs and the redundancy observation is executed at 10-th step per interval. Finally, the optimal architecture will be trained for another 10 epochs.
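
For readers attempting a reproduction, the values quoted in the Experiment Setup entry can be collected into a small configuration object. This is a minimal sketch only: the class name `DASSchedule` and all field names are hypothetical and do not come from the paper or its repository. The redundancy-observation interval and the skipped number m are left out because the quoted text does not give them unambiguously.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DASSchedule:
    """Values quoted in the Experiment Setup entry (all names are hypothetical)."""

    adapter_hidden_dim: int = 96         # width of hidden states in adapters
    skip_adapter_hidden_dim: int = 192   # adapter used for the skip connection
    warmup_epochs: int = 1               # warm-up with randomly sampled subnetworks
    search_epochs: int = 2               # redundancy search phase
    final_train_epochs: int = 10         # training of the selected architecture

    def phases(self):
        """Return (phase name, epochs) pairs in the order the entry describes."""
        return [
            ("warmup", self.warmup_epochs),
            ("search", self.search_epochs),
            ("train", self.final_train_epochs),
        ]

    def total_epochs(self):
        """Sum the epochs over all three phases."""
        return sum(epochs for _, epochs in self.phases())


schedule = DASSchedule()
for name, epochs in schedule.phases():
    print(f"{name}: {epochs} epoch(s)")
print("total:", schedule.total_epochs())
```

Keeping the schedule in one frozen object makes it easy to log alongside results, which partly compensates for the unreported software versions noted in the Software Dependencies entry.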