Self-Prompt Mechanism for Few-Shot Image Recognition

Authors: Mingchen Song, Huiqiang Wang, Guoqiang Zhong

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the proposed SPM in both 5-way 1-shot and 5-way 5-shot settings for standard single-domain and cross-domain few-shot recognition datasets, respectively. Our code is available at https://github.com/codeshop715/SPM. (See the episode-sampling sketch after the table.)
Researcher Affiliation | Academia | Mingchen Song*, Huiqiang Wang*, Guoqiang Zhong; College of Computer Science and Technology, Ocean University of China; songmingchen@stu.ouc.edu.cn, wanghuiqiang@stu.ouc.edu.cn, gqzhong@ouc.edu.cn
Pseudocode | No | The paper describes the self-prompt mechanism and related processes in textual form and through diagrams, but does not provide structured pseudocode or an algorithm block.
Open Source Code | Yes | Our code is available at https://github.com/codeshop715/SPM.
Open Datasets | Yes | We employ two standard benchmarks to evaluate our proposed SPM method, including MiniImageNet (Vinyals et al. 2016) and CIFAR-FS (Bertinetto et al. 2018).
Dataset Splits | Yes | MiniImageNet contains 100 classes, which are divided into 64 classes for training, 16 for validation, and 20 for testing. CIFAR-FS is a few-shot image recognition dataset built on CIFAR-100. We follow the split proposed by Hu et al. (2022), where the dataset is divided into 64 classes for training, 16 for validation, and 20 for testing, and each class comprises 100 images.
Hardware Specification | Yes | We use a single Nvidia GeForce 4090 for all the experiments.
Software Dependencies | No | The paper mentions using ViT and CLIP models, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | We employ ViT as our backbone network, and the backbone is trained for 20 epochs using ViT-small and 80 epochs using ViT-base, each epoch consisting of 2000 episodes. Our learning-rate schedule incorporates warm-up and cosine annealing, with the learning rate commencing at 10^-6, surging to 5×10^-5 in 5 epochs, and gradually tapering off to 10^-6 via cosine annealing. To attain the best test results, we use an early-stopping strategy to train our model. We use a single Nvidia GeForce 4090 for all the experiments. (A sketch of this schedule follows the table.)
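
To make the 5-way 1-shot and 5-way 5-shot evaluation protocol above concrete, here is a minimal sketch of episodic sampling over the reported 64 training classes. The function name, the 15-query-images-per-class default, and the toy data are illustrative assumptions, not taken from the paper or the SPM repository.

```python
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, q_query=15):
    """Draw one N-way K-shot episode from a dict mapping class label
    to a list of image identifiers."""
    classes = random.sample(list(images_by_class), n_way)
    support, query = [], []
    for label in classes:
        picked = random.sample(images_by_class[label], k_shot + q_query)
        support += [(img, label) for img in picked[:k_shot]]
        query += [(img, label) for img in picked[k_shot:]]
    return support, query

# Toy usage: 64 training classes with 100 images each, as in the reported splits.
train_set = {c: [f"class{c}_img{i}" for i in range(100)] for c in range(64)}
support, query = sample_episode(train_set, n_way=5, k_shot=1)  # 5-way 1-shot
print(len(support), len(query))  # 5 support examples, 75 query examples
```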
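
The reported learning-rate schedule (start at 10^-6, reach 5×10^-5 after 5 warm-up epochs, then cosine-anneal back toward 10^-6 over the remaining epochs, 20 in total for ViT-small) can be written as a simple per-epoch rule. A linear warm-up and per-epoch rather than per-step updates are assumptions here; the paper excerpt does not spell out those details.

```python
import math

def lr_at_epoch(epoch, total_epochs=20, warmup_epochs=5,
                base_lr=5e-5, min_lr=1e-6):
    """Warm-up followed by cosine annealing, using the values reported
    for ViT-small (assumed linear warm-up, per-epoch updates)."""
    if epoch < warmup_epochs:
        # Linear warm-up from min_lr to base_lr over the first warm-up epochs.
        return min_lr + (base_lr - min_lr) * epoch / warmup_epochs
    # Cosine annealing from base_lr back down toward min_lr.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Example: inspect the schedule for the 20 ViT-small epochs.
for e in range(20):
    print(f"epoch {e:2d}: lr = {lr_at_epoch(e):.2e}")
```

For ViT-base, the same rule would apply with total_epochs=80.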