Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning
Authors: Samyadeep Basu, Shell Hu, Daniela Massiceti, Soheil Feizi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our paper, we conduct a large-scale, experimentally consistent, empirical analysis to study PEFTs for few-shot image classification. Through a battery of over 1.8k controlled experiments on large-scale few-shot benchmarks including META-DATASET (MD) and ORBIT, we uncover novel insights on PEFTs that cast light on their efficacy in fine-tuning ViTs for few-shot classification. |
| Researcher Affiliation | Collaboration | Samyadeep Basu (1), Shell Hu (3), Daniela Massiceti (2), Soheil Feizi (1); (1) University of Maryland, College Park; (2) Microsoft Research, Cambridge; (3) Samsung Research, Cambridge |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the provided main paper text. |
| Open Source Code | No | No public code repository is released; the paper only states: 'We provide a PyTorch-like implementation in the Appendix.' |
| Open Datasets | Yes | We run all our experiments on two challenging large-scale few-shot classification benchmarks (i) META-DATASET (Triantafillou et al. 2019) and (ii) ORBIT (Massiceti et al. 2021). |
| Dataset Splits | Yes | The validation set is a fixed set of 5 few-shot tasks sampled from the downstream dataset to which the ViT is being adapted. |
| Hardware Specification | No | The paper mentions 'expensive in time, compute and storage' but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma and Ba 2014)' as an optimizer but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Following (Hu et al. 2022), we choose a learning rate from {0.0001, 0.001, 0.01, 0.1} and select the rate that gives the best performance on the validation set. ... For each few-shot task, we fine-tune for 40 steps with Adam (Kingma and Ba 2014) using the selected learning rate. (See the code sketch below the table.) |
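
As a reading aid, here is a minimal PyTorch sketch of the quoted fine-tuning protocol. The learning-rate grid {0.0001, 0.001, 0.01, 0.1}, the fixed validation set of 5 few-shot tasks, and the 40 Adam steps come from the paper's stated setup; `build_model`, `peft_parameters`, `evaluate`, and the task fields are hypothetical stand-ins, not the authors' actual implementation.

```python
import torch

# From the paper's setup: LR grid and number of fine-tuning steps.
LEARNING_RATES = [0.0001, 0.001, 0.01, 0.1]
NUM_STEPS = 40

def finetune(model, task, lr, num_steps=NUM_STEPS):
    """Fine-tune only the PEFT parameters on one task's support set.

    `peft_parameters` is a hypothetical helper that returns the small
    set of trainable parameters (e.g., adapter/LoRA weights).
    """
    optimizer = torch.optim.Adam(peft_parameters(model), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(num_steps):
        logits = model(task.support_images)
        loss = loss_fn(logits, task.support_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model

def select_learning_rate(val_tasks):
    """Pick the LR with the best mean query accuracy over the fixed
    validation set of 5 few-shot tasks; `build_model` and `evaluate`
    are hypothetical stand-ins."""
    best_lr, best_acc = None, -1.0
    for lr in LEARNING_RATES:
        accs = []
        for task in val_tasks:
            model = finetune(build_model(), task, lr)
            accs.append(evaluate(model, task.query_images, task.query_labels))
        mean_acc = sum(accs) / len(accs)
        if mean_acc > best_acc:
            best_lr, best_acc = lr, mean_acc
    return best_lr
```

The selected rate is then reused to fine-tune on each downstream few-shot task for 40 steps, which matches the per-task protocol quoted in the Experiment Setup row.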