Historical Test-time Prompt Tuning for Vision Foundation Models
Authors: Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that HisTPT achieves superior prompt tuning performance consistently while handling different visual recognition tasks (e.g., image classification, semantic segmentation, and object detection) and test samples from continuously changing domains. |
| Researcher Affiliation | Collaboration | 1 College of Computing and Data Science, Nanyang Technological University, Singapore 2 College of Computer Science and Technology, Zhejiang University of Technology, China 3 UCAS-Terminus AI Lab, University of Chinese Academy of Sciences, China |
| Pseudocode | Yes | We provide the pseudo codes of the proposed historical test-time prompt tuning (HisTPT), as shown in Algorithm 1. |
| Open Source Code | No | Code will be released after being accepted. |
| Open Datasets | Yes | We evaluate HisTPT over multiple datasets across three widely studied visual recognition tasks: Semantic Segmentation: We benchmark HisTPT over 6 image segmentation datasets with pixel-wise annotations, including Cityscapes [16], BDD100K [67], Mapillary [68], ADE20K [69], Pascal Context [70] and ACDC [17]. |
| Dataset Splits | No | The paper mentions 'test samples' and a 'continuous flow' of data but does not explicitly specify traditional train/validation/test splits used for model development or evaluation, nor does it refer to predefined validation splits for the datasets used. |
| Hardware Specification | Yes | All the experiments are conducted on one NVIDIA Tesla V100 GPU with batch size 1. |
| Software Dependencies | No | The paper mentions software components like the 'AdamW optimizer' and models like 'SEEM' and 'CLIP' but does not provide specific version numbers for its software dependencies (e.g., PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | In training, we employ the AdamW optimizer [84] with a weight decay of 0.05, and set the initial learning rate as 0.0001. For all experiments, the prompt is initialized as 'a photo of a' and the corresponding 4 tokens (i.e., M = 4) of dimension D = 512 are optimized as in [7, 8]. Unless otherwise specified, we set the size of the local knowledge bank and hard-sample knowledge bank at L = H = 32 and the number of the selected hard-sample features K at 16. We set the update coefficient γ of the global knowledge bank at 0.99. Following [7], we set the optimization step in test-time prompt tuning at 1 by default. |
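The setup row above fixes several bookkeeping hyperparameters: local and hard-sample knowledge banks capped at L = H = 32 entries, and a global knowledge bank updated with coefficient γ = 0.99. The sketch below illustrates how such banks could be maintained over a stream of test features; the class name, the FIFO eviction policy, and the entropy-based hard-sample criterion are illustrative assumptions, not the paper's released implementation.

```python
import numpy as np

class KnowledgeBanks:
    """Illustrative knowledge-bank bookkeeping for test-time prompt tuning.

    local: recent test features, capped at max_size (L = 32 in the paper).
    hard:  features flagged as hard samples, capped at max_size (H = 32).
    global_mean: exponential moving average with coefficient gamma = 0.99.
    """

    def __init__(self, dim=512, max_size=32, gamma=0.99):
        self.local = []
        self.hard = []
        self.global_mean = np.zeros(dim)
        self.max_size = max_size
        self.gamma = gamma

    def update(self, feature, entropy, hard_threshold=1.0):
        # Local bank: keep only the most recent max_size features (FIFO).
        self.local.append(feature)
        if len(self.local) > self.max_size:
            self.local.pop(0)
        # Hard-sample bank: store features whose prediction entropy is
        # high (the threshold here is an assumed stand-in criterion).
        if entropy > hard_threshold:
            self.hard.append(feature)
            if len(self.hard) > self.max_size:
                self.hard.pop(0)
        # Global bank: EMA update, g <- gamma * g + (1 - gamma) * feature.
        self.global_mean = (self.gamma * self.global_mean
                            + (1 - self.gamma) * feature)
```

With constant input the global mean converges geometrically toward the feature at rate (1 − γ), which is why a γ of 0.99 yields a slowly drifting summary of the whole test stream while the size-32 banks track only recent and hard samples.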