reproducibilityindex.ai

Compressed Video Prompt Tuning

Authors: Bing Li, Jiaxin Chen, Xiuguo Bao, Di Huang

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive evaluations on HMDB-51, UCF-101 and Something Something v2 demonstrate that CVPT remarkably outperforms the state-of-the-art counterparts, delivering a much better balance between accuracy and efficiency.
Researcher Affiliation	Academia	Bing Li (1,2), Jiaxin Chen (2), Xiuguo Bao (3), Di Huang (1,2). (1) SKLSDE, Beihang University, Beijing, China; (2) IRIP Lab, SCSE, Beihang University, Beijing, China; (3) CNCERT/CC, Beijing, China
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks. Figure 2 illustrates the framework but is not pseudocode.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	Yes	HMDB-51 and UCF-101 are two relatively small datasets, which contain 6,766 videos from 51 action categories and 13,320 videos from 101 categories, respectively. Something-Something v2 (SSv2) is a large-scale motion-centric video dataset, including 168,913 videos for training and 24,777 videos for validation from 174 categories.
Dataset Splits	Yes	Something-Something v2 (SSv2) is a large-scale motion-centric video dataset, including 168,913 videos for training and 24,777 videos for validation from 174 categories.
Hardware Specification	Yes	We adopt the original model configurations and train the prompt parameters using the AdamW optimizer [35] on 12 NVIDIA V100 GPUs.
Software Dependencies	No	The paper mentions software components like 'AdamW optimizer', 'ViT' and 'Swin Transformer' but does not provide specific version numbers for any software dependencies.
Experiment Setup	Yes	The base learning rate, weight decay and batch size are set to 1 × 10^-3, 1 × 10^-4 and 240, respectively. Additionally, we adopt a warm-up strategy within the first 5 training epochs.
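
The quoted setup can be written down as a small configuration sketch. This is not the authors' code (the Open Source Code row reports none is available). The constants come from the Experiment Setup and Hardware Specification rows. Everything else is an assumption made for illustration: the names `warmup_lr` and `per_gpu_batch` are hypothetical, the warm-up is taken to be linear, the rate is held constant after warm-up because the source does not describe a decay schedule, and 240 is treated as a global batch size split evenly across the 12 GPUs.

```python
# Values quoted in the Experiment Setup and Hardware Specification rows.
BASE_LR = 1e-3
WEIGHT_DECAY = 1e-4
BATCH_SIZE = 240
WARMUP_EPOCHS = 5
NUM_GPUS = 12


def warmup_lr(epoch: int, base_lr: float = BASE_LR,
              warmup_epochs: int = WARMUP_EPOCHS) -> float:
    """Learning rate for a 0-indexed epoch.

    Assumption: the warm-up is linear; the source only says a warm-up
    strategy is used within the first 5 training epochs.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < warmup_epochs:
        # Ramp up so that the last warm-up epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Assumption: hold the base rate afterwards; no decay schedule is quoted.
    return base_lr


def per_gpu_batch(batch_size: int = BATCH_SIZE, num_gpus: int = NUM_GPUS) -> int:
    """Per-device batch size, assuming 240 is a global batch split evenly."""
    if batch_size % num_gpus != 0:
        raise ValueError("batch size is not divisible by the number of GPUs")
    return batch_size // num_gpus


if __name__ == "__main__":
    for epoch in range(7):
        print(f"epoch {epoch}: lr = {warmup_lr(epoch):.1e}")
    print(f"per-GPU batch size: {per_gpu_batch()}")
```

Under these assumptions each GPU would process 240 / 12 = 20 clips per step. With a framework optimizer, `BASE_LR` and `WEIGHT_DECAY` would be passed as the AdamW learning rate and weight decay, and `warmup_lr` would drive the per-epoch rate.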