Focus Your Attention when Few-Shot Classification

Authors: Haoqing Wang, Shibo Jie, Zhi-Hong Deng

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments show that our method can improve the performance of full or parameter-efficient fine-tuning methods on few-shot tasks. |
| Researcher Affiliation | Academia | Haoqing Wang, Shibo Jie, Zhi-Hong Deng. School of Intelligence Science and Technology, Peking University. {wanghaoqing, parsley, zhdeng}@pku.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/Haoqing-Wang/FORT. |
| Open Datasets | Yes | The experiments are conducted on the few-shot benchmark from [58] (i.e., the CUB [61], Cars [30], Places [75] and Plantae [59] datasets) and two other fine-grained datasets: Aircraft [39] and Pets [43]. |
| Dataset Splits | No | The paper mentions "We only use 50 tasks for quick hyper-parameter selection", which implies a validation step, but it does not provide explicit training/validation/test split percentages or absolute sample counts needed to reproduce the data partitioning. |
| Hardware Specification | Yes | All experiments can be conducted on a single Tesla V100 with 32 GB memory. |
| Software Dependencies | No | The paper mentions using the "AdamW [37] optimizer" and "SGD optimizer" but does not provide version numbers for software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | We set the rank to 4 for LoRA [24] and use 10 learnable prompts at each layer for VPT [26]. We mainly use the AdamW [37] optimizer for fine-tuning... We set the batch size to 20, the same as the number of classes... We set λ = 1 for the DINO pre-trained model and λ = 50 for the CLIP pre-trained model for simplicity. The other hyper-parameters, including the learning rate, number of fine-tuning epochs, temperature τ and coefficient α, vary across fine-tuning methods... To this end, we only provide their candidate values in Table 7. |
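The reported setup fixes a few hyper-parameters (LoRA rank, VPT prompt count, optimizer, batch size, λ per backbone) while leaving others method-dependent. A minimal, hypothetical Python sketch collecting the fixed values; the function and key names are illustrative and not from the paper's released code, only the numeric values come from the report above.

```python
# Hypothetical helper gathering the fixed hyper-parameters reported for FORT.
# Only the values (rank 4, 10 prompts, AdamW, batch size 20, lambda 1/50)
# are from the paper; names and structure are this sketch's assumptions.

def make_config(backbone: str) -> dict:
    """Return the fixed fine-tuning settings for one pre-trained backbone."""
    if backbone not in {"dino", "clip"}:
        raise ValueError(f"unknown backbone: {backbone}")
    return {
        "lora_rank": 4,               # rank used for LoRA adapters
        "vpt_prompts_per_layer": 10,  # learnable prompts at each layer for VPT
        "optimizer": "AdamW",         # the mainly used fine-tuning optimizer
        "batch_size": 20,             # equal to the number of classes per task
        # coefficient lambda, set per pre-trained model for simplicity
        "lambda": 1.0 if backbone == "dino" else 50.0,
        # learning rate, epochs, temperature tau and coefficient alpha are
        # method-dependent; the paper only lists candidate values (Table 7),
        # so they are deliberately left out of this fixed config.
    }

print(make_config("clip")["lambda"])  # 50.0
```

This keeps the reproducible constants in one place while making explicit which knobs still require the per-method search described in Table 7.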