Task-Adaptive Prompted Transformer for Cross-Domain Few-Shot Learning

Authors: Jiamin Wu, Xin Liu, Xiaotian Yin, Tianzhu Zhang, Yongdong Zhang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on the Meta-Dataset benchmark demonstrate that our method achieves superior results against state-of-the-art methods.
Researcher Affiliation | Academia | Deep Space Exploration Laboratory / School of Information Science and Technology, University of Science and Technology of China
Pseudocode | No | The paper describes methods in text and uses diagrams (Figure 1, Figure 2) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a link (https://github.com/hushell/pmf_cvpr22) to the official code of a *baseline* method (PMF), but does not provide a link or statement about the availability of the code for their own proposed Meta Prompt method.
Open Datasets | Yes | We evaluate our model on Meta-Dataset (Triantafillou et al. 2019), a cross-domain few-shot learning benchmark that collects 10 public image datasets from a diverse range of domains: ImNet, Omni, Acraft, Bird, DTD, QDraw, Fungi, Flwr, Sign and COCO.
Dataset Splits | Yes | We use the first 8 datasets for meta-training, where each dataset is further divided into train/val/test splits with disjoint classes. The test split of these datasets is used to evaluate the performance on seen domains (in-domain performance) (see the domain-split sketch after this table).
Hardware Specification | No | The paper specifies the model architectures used (ViT-S, ViT-B) and pre-trained weights, but does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions optimizers (SGD, Adadelta) but does not provide specific version numbers for any software, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | The length of the prompt is set as 8. We follow the episodic training protocol and use the SGD optimizer with a learning rate of 1e-4 for the ViT backbone and 5e-4 for the prompt generator. The task-adaptive prompt is inserted into the second layer of ViT, which we empirically found performs best. During meta-test, we randomly sample 600 N-way K-shot tasks from the meta-test split of each dataset, where N varies from 5 to 50 and K varies from 1 to 100. The bias parameters of the backbone are tuned for 30 iterations for each task, using the Adadelta optimizer with a learning rate of 1 (see the hyper-parameter sketch after this table).
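The domain split quoted in the Dataset Splits row can be stated compactly. Below is a minimal Python sketch, assuming the Meta-Dataset domain abbreviations listed in the Open Datasets row; the constant names and the helper function are hypothetical and only illustrate which domains are visible at meta-training versus meta-test.

```python
# Hypothetical sketch of the domain split described in the Dataset Splits row.
# The first 8 Meta-Dataset domains are used for meta-training (each with
# disjoint-class train/val/test splits); Sign and COCO are held out entirely
# and appear only at meta-test time as unseen (out-of-domain) datasets.
META_TRAIN_DOMAINS = ["ImNet", "Omni", "Acraft", "Bird", "DTD", "QDraw", "Fungi", "Flwr"]
HELD_OUT_DOMAINS = ["Sign", "COCO"]

def meta_test_domains(include_unseen: bool = True) -> list[str]:
    """Domains evaluated at meta-test time: the test splits of the 8 seen
    domains measure in-domain performance, and the 2 held-out domains
    measure out-of-domain performance."""
    return META_TRAIN_DOMAINS + (HELD_OUT_DOMAINS if include_unseen else [])
```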
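The Experiment Setup row lists concrete hyper-parameters. The following is a minimal PyTorch sketch of how they could be wired together; `vit_backbone`, `prompt_generator`, and `loss_fn` are hypothetical placeholders, and only the optimizer settings and the per-task bias tuning mirror the quoted description, not the authors' actual implementation.

```python
import torch

PROMPT_LENGTH = 8        # "The length of the prompt is set as 8."
PROMPT_INSERT_LAYER = 2  # task-adaptive prompt inserted into the second ViT layer

def build_meta_train_optimizer(vit_backbone, prompt_generator):
    # Episodic meta-training: SGD with a learning rate of 1e-4 for the
    # ViT backbone and 5e-4 for the prompt generator.
    return torch.optim.SGD(
        [
            {"params": vit_backbone.parameters(), "lr": 1e-4},
            {"params": prompt_generator.parameters(), "lr": 5e-4},
        ],
        lr=1e-4,
    )

def tune_biases_on_task(vit_backbone, support_x, support_y, loss_fn):
    # Meta-test: for each of the 600 sampled N-way K-shot tasks, only the
    # bias parameters of the backbone are tuned, for 30 iterations,
    # using Adadelta with a learning rate of 1.
    biases = [p for name, p in vit_backbone.named_parameters() if name.endswith("bias")]
    optimizer = torch.optim.Adadelta(biases, lr=1.0)
    for _ in range(30):
        optimizer.zero_grad()
        loss = loss_fn(vit_backbone(support_x), support_y)
        loss.backward()
        optimizer.step()
    return vit_backbone
```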