SA²VP: Spatially Aligned-and-Adapted Visual Prompt
Authors: Wenjie Pei, Tongqi Xia, Fanglin Chen, Jinsong Li, Jiandong Tian, Guangming Lu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three challenging benchmarks for image classification demonstrate the superiority of our model over other state-of-the-art methods for visual prompt tuning. Code is available at https://github.com/tommy-xq/SA2VP. [...] Experiments Experimental Setup Datasets. We conduct experiments on three challenging benchmarks across diverse scenes: FGVC, HTA and VTAB-1k (Zhai et al. 2019). |
| Researcher Affiliation | Collaboration | ¹Harbin Institute of Technology, Shenzhen; ²Shenzhen Jiang & Associates Creative Design Co., Ltd; ³Shenyang Institute of Automation, Chinese Academy of Sciences |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/tommy-xq/SA2VP. |
| Open Datasets | Yes | We conduct experiments on three challenging benchmarks across diverse scenes: FGVC, HTA and VTAB-1k (Zhai et al. 2019). FGVC benchmark contains 5 image datasets, including CUB (Wah et al. 2011), NABirds (Van Horn et al. 2015), Oxford Flowers (Nilsback and Zisserman 2008), Stanford Dogs (Khosla et al. 2011) and Stanford Cars (Gebru et al. 2017). [...] HTA [...] including CIFAR10 (Krizhevsky, Hinton et al. 2009), CIFAR100 (Krizhevsky, Hinton et al. 2009), DTD (Cimpoi et al. 2014), CUB-200 (Wah et al. 2011), NABirds (Van Horn et al. 2015), Stanford-Dogs (Khosla et al. 2011), Oxford-Flowers (Nilsback and Zisserman 2008), Food101 (Bossard, Guillaumin, and Van Gool 2014), GTSRB (Stallkamp et al. 2012) and SVHN (Netzer et al. 2011). |
| Dataset Splits | Yes | We follow VPT to split data for training and test. [...] Using the default train-val-test split, we follow the experimental configuration of DAM-VP (Huang et al. 2023) for a fair comparison. [...] Each dataset in VTAB-1k contains 1000 training images, among which 800 are used for training and the remaining 200 for validation (Zhai et al. 2019). A minimal sketch of this split follows the table. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper only mentions the optimizer (AdamW) but does not provide specific software dependencies such as programming languages, libraries, or frameworks with version numbers. |
| Experiment Setup | Yes | AdamW (Loshchilov and Hutter 2017) is used for optimization with an initial learning rate of 1e-3, weight decay of 1e-4, and a batch size of 64 or 128. Since all experiments are performed for image classification on all benchmarks, classification accuracy is used as the evaluation metric. A hedged optimizer sketch follows the table. |
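
The VTAB-1k split quoted in the Dataset Splits row is simple enough to sketch. Below is a minimal Python illustration, assuming the 1000 training images are identified by integer IDs; the function name `split_vtab1k` and the seeded shuffle are illustrative assumptions, not the authors' actual protocol (which follows VPT). Only the 800/200 proportion is grounded in the paper.

```python
import random

def split_vtab1k(image_ids, seed=0):
    """Split the 1000 VTAB-1k training images into 800 train / 200 val.

    The 800/200 proportion is quoted from the paper; the seeded shuffle
    is an assumption, since the exact split protocol follows VPT.
    """
    assert len(image_ids) == 1000, "VTAB-1k provides 1000 training images"
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return shuffled[:800], shuffled[800:]

train_ids, val_ids = split_vtab1k(range(1000))
print(len(train_ids), len(val_ids))  # 800 200
```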
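
Likewise, the Experiment Setup row can be mirrored in a short PyTorch sketch. The linear-layer placeholder and the `accuracy` helper are assumptions for illustration; only the AdamW hyperparameters (learning rate 1e-3, weight decay 1e-4) and the batch sizes (64 or 128) come from the paper.

```python
import torch

# Placeholder for the prompted ViT backbone; the actual SA2VP model
# is defined in the authors' repository (github.com/tommy-xq/SA2VP).
model = torch.nn.Linear(768, 100)

# Hyperparameters quoted from the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
batch_size = 64  # the paper reports 64 or 128, depending on the benchmark

def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Top-1 classification accuracy, the paper's evaluation metric."""
    return (logits.argmax(dim=-1) == labels).float().mean().item()
```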