FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

Authors: Kun Song, Huimin Ma, Bochao Zou, Huishuai Zhang, Weiran Huang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results validate the efficacy of our approach for both ID and OOD tasks.
Researcher Affiliation | Collaboration | (1) SCCE, University of Science and Technology Beijing; (2) Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University; (3) Microsoft Research Asia
Pseudocode | No | The paper describes the proposed method in textual format but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code could be found in https://github.com/skingorz/FD-Align.
Open Datasets | Yes | For the OOD setting, we evaluate our method on two different datasets. On the one hand, we train the model on ImageNet [27] following the few-shot task in CLIP and test the performance on two OOD variants of ImageNet [27]: ImageNetV2 [28] and ImageNet-Sketch [29] with the same 1000 classes. On the other hand, we follow the traditional few-shot learning strategy and fine-tune the model on the train split of miniImageNet and evaluate the model on Meta-Dataset [30], the BSCD-FSL benchmark [31], and DomainNet [32], for a total of 19 datasets.
Dataset Splits | Yes | Figure 9 depicts the evolution of model accuracy and loss on the validation set throughout the fully fine-tuning and FD-Align processes.
Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU models, CPU specifications, or memory.
Software Dependencies | No | The paper mentions using 'open source ViT-B/32 as the backbone of the CLIP' and 'OpenAI ImageNet prompt templates' but does not specify software versions for any libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. (A hedged sketch of the implied dependency stack follows the table.)
Experiment Setup | Yes | L_total = α·L_class + β·L_spurious, where we set α to 1 and β to 20 in this paper. ... we set n to 60, k to 20 in the spurious prototype correction stage. We employ the Stochastic Gradient Descent (SGD) optimizer for model fine-tuning, conducting the process over 60 epochs. (A hedged code sketch of this training recipe follows below.)
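
For the Software Dependencies row above, no versions are pinned in the paper; the following is a minimal sketch, assuming PyTorch plus OpenAI's open-source CLIP package, of loading the ViT-B/32 backbone and encoding ImageNet-style prompt templates. The two-entry template list and the helper `class_text_features` are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch only: the paper names the CLIP ViT-B/32 backbone and the
# OpenAI ImageNet prompt templates but specifies no library versions; the
# package choice (openai/CLIP) is an assumption.
import torch
import clip  # openai/CLIP package

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Small illustrative subset of the OpenAI ImageNet prompt templates.
templates = ["a photo of a {}.", "a sketch of a {}."]

def class_text_features(classname: str) -> torch.Tensor:
    """Average the normalized text embeddings of one class over all templates (hypothetical helper)."""
    tokens = clip.tokenize([t.format(classname) for t in templates]).to(device)
    with torch.no_grad():
        feats = model.encode_text(tokens)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return feats.mean(dim=0)
```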
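
For the Experiment Setup row above, this is a minimal, self-contained sketch of the reported recipe: a weighted sum L_total = α·L_class + β·L_spurious with α = 1 and β = 20, optimized with SGD for 60 epochs. The model, both loss terms, the learning rate, and the dummy data are hypothetical placeholders; the paper's spurious-prototype loss and its n = 60, k = 20 prototype-correction stage are not reproduced here.

```python
# Hedged sketch of the reported training recipe; every component below is a
# placeholder, not the authors' released implementation.
import torch
import torch.nn as nn

model = nn.Linear(512, 100)                    # placeholder classification head over CLIP features
class_loss_fn = nn.CrossEntropyLoss()          # standard classification loss standing in for L_class

def spurious_loss_fn(logits: torch.Tensor) -> torch.Tensor:
    """Stand-in for the paper's spurious (prototype-alignment) loss L_spurious."""
    return logits.pow(2).mean()

ALPHA, BETA, EPOCHS = 1.0, 20.0, 60            # α = 1, β = 20, 60 epochs (values reported in the paper)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # learning rate not reported; assumed

features = torch.randn(32, 512)                # dummy batch of image features
labels = torch.randint(0, 100, (32,))

for epoch in range(EPOCHS):
    logits = model(features)
    total_loss = ALPHA * class_loss_fn(logits, labels) + BETA * spurious_loss_fn(logits)
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```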