Revisit Finetuning Strategy for Few-Shot Learning to Transfer the Embeddings
Authors: Heng Wang, Tan Yue, Xiang Ye, Zihang He, Bohan Li, Yong Li
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the effectiveness of the designed LP-FT-FB, we conducted comprehensive experiments on the commonly used FSL datasets under different backbones for in-domain and cross-domain FSL tasks. The experimental results show that the proposed LP-FT-FB outperforms the SOTA FSL methods. |
| Researcher Affiliation | Academia | Anonymous authors; paper under double-blind review. No affiliations are provided, so classification is not possible. |
| Pseudocode | No | The provided text does not contain any explicit pseudocode or algorithm blocks. It mentions that 'The whole flow of LP-FT-FB is given in Appendix.', but the Appendix content is not included here. |
| Open Source Code | Yes | The code is available at https://github.com/whzyf951620/Linear_Probing_Finetuning_Firth_Bias. |
| Open Datasets | Yes | The experiments are evaluated on three typical FSL datasets, mini-Imagenet Vinyals et al. (2016), tiered-Imagenet Ren et al. (2018), and CUB Wah et al. (2011). |
| Dataset Splits | Yes | mini-Imagenet consists of 100 classes from ImageNet, which are split randomly into 64 base, 16 validation, and 20 novel classes. tiered-Imagenet consists of 608 classes from ImageNet, which are split randomly into 351 base, 97 validation, and 160 novel classes. CUB contains 200 classes with a total of 11,788 images of size 84×84; its base, validation, and novel splits contain 100, 50, and 50 classes. (These splits are summarized in the first sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions optimizers like SGD and references other models, but it does not specify any software dependencies with version numbers (e.g., specific Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | For LP, we used the linear classifier proposed in Baseline++ Chen et al. (2019). For the optimizer, SGD is used with learning rate α1 = 0.01, momentum 0.9, dampening 0.9, and weight decay 1e-3. For the FBR of the classifier, the factor λ in Eq. 2 is set to 1. For FT, the feature extractor and the classifier are manually finetuned together with learning rate α2 = 1e-3, and the i-FBR factor λinv in Eq. 7 is set to 1e-3. (See the optimizer sketch after the table.) |
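The class splits quoted in the Dataset Splits row can be collected into a small configuration dictionary. This is a bookkeeping sketch only, assuming the split sizes quoted above; the name `DATASET_SPLITS` is ours, and the actual per-class assignments come from the standard splits of each benchmark.

```python
# Class counts per split as quoted in the report (sketch for bookkeeping only).
DATASET_SPLITS = {
    # dataset: {"base": ..., "val": ..., "novel": ..., "total": ...}
    "mini-ImageNet":   {"base": 64,  "val": 16,  "novel": 20,  "total": 100},
    "tiered-ImageNet": {"base": 351, "val": 97,  "novel": 160, "total": 608},
    "CUB":             {"base": 100, "val": 50,  "novel": 50,  "total": 200},  # 11,788 images, 84x84
}

# Sanity check: the base/validation/novel splits partition each dataset's classes.
for name, split in DATASET_SPLITS.items():
    assert split["base"] + split["val"] + split["novel"] == split["total"], name
```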
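The hyperparameters in the Experiment Setup row can likewise be gathered into a short PyTorch sketch. This is a minimal illustration, not the authors' released code: `encoder`, `classifier`, and `build_optimizers` are placeholder names, and carrying the LP momentum, dampening, and weight-decay settings over to the FT stage is our assumption, since the report only quotes the FT learning rate.

```python
# Minimal sketch of the reported LP-FT-FB optimization settings (assumed, not the
# authors' code). `classifier` would be the Baseline++-style linear/cosine head.
import torch


def build_optimizers(encoder: torch.nn.Module, classifier: torch.nn.Module):
    # Linear probing (LP): only the classifier is trained, with the reported
    # SGD settings (lr alpha_1 = 0.01, momentum 0.9, dampening 0.9, wd 1e-3).
    lp_optimizer = torch.optim.SGD(
        classifier.parameters(),
        lr=0.01,
        momentum=0.9,
        dampening=0.9,
        weight_decay=1e-3,
    )
    # Finetuning (FT): encoder and classifier are updated together with the
    # smaller learning rate alpha_2 = 1e-3. Reusing momentum/dampening/weight
    # decay here is our assumption; the report only quotes the learning rate.
    ft_optimizer = torch.optim.SGD(
        list(encoder.parameters()) + list(classifier.parameters()),
        lr=1e-3,
        momentum=0.9,
        dampening=0.9,
        weight_decay=1e-3,
    )
    return lp_optimizer, ft_optimizer


# Regularization strengths quoted in the setup: the Firth bias-reduction (FBR)
# factor lambda of Eq. 2 and the inverse-FBR factor lambda_inv of Eq. 7.
LAMBDA_FBR = 1.0
LAMBDA_INV_FBR = 1e-3
```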