A Closer Look at Few-shot Classification Again

Authors: Xu Luo, Hao Wu, Ji Zhang, Lianli Gao, Jing Xu, Jingkuan Song

ICML 2023

Each row below gives a reproducibility variable, its assessed result, and the supporting LLM response.
Research Type: Experimental. In this paper, we empirically prove that the training algorithm and the adaptation algorithm can be completely disentangled, which allows algorithm analysis and design to be done individually for each phase. Our meta-analysis for each phase reveals several interesting insights that may help better understand key aspects of few-shot classification and connections with other fields such as visual representation learning and transfer learning.
Researcher Affiliation: Academia. ¹University of Electronic Science and Technology of China, ²Harbin Institute of Technology, Shenzhen.
Pseudocode: No. The paper does not contain any pseudocode or algorithm blocks.
Open Source Code: Yes. Code and pre-trained models (in PyTorch) are available at https://github.com/Frankluox/Closer_Look_Again_Few_Shot.
Open Datasets: Yes. For training, we choose three datasets of different scales: the train split of miniImageNet (Vinyals et al., 2016), which contains 38,400 images from 64 classes; the train split of ImageNet (Deng et al., 2009), which contains more than 1 million images from 1,000 classes; and the large-scale multimodal dataset WebImageText (Radford et al., 2021), which contains 400 million (image, text) pairs.
Dataset Splits: Yes. For the adaptation analysis experiments in Section 5, we partition ImageNet and QuickDraw to hold out a 100-class validation set; the rest is used as the test set.
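For concreteness, the class-disjoint partition described above could be sketched as follows. This is a minimal illustration only: the helper name, the use of a fixed random seed, and the random selection rule are assumptions, since the paper does not state how the 100 validation classes were chosen.

```python
import random

def split_classes(all_classes, num_val_classes=100, seed=0):
    """Partition a dataset's classes into disjoint validation and test sets.

    Illustrative sketch: 100 classes are held out for validation and the
    remaining classes form the test set, mirroring the ImageNet/QuickDraw
    split described in the paper. Selection rule and seed are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(all_classes)
    rng.shuffle(shuffled)
    return set(shuffled[:num_val_classes]), set(shuffled[num_val_classes:])

# Example: a 1000-class label space such as ImageNet's.
val_classes, test_classes = split_classes(range(1000))
assert len(val_classes) == 100 and len(test_classes) == 900
```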
Hardware Specification: No. The paper does not explicitly describe the hardware (e.g., GPU models, CPU models, memory) used to run the experiments.
Software Dependencies: No. The paper mentions PyTorch as the framework used for the code and models, but it does not provide version numbers for PyTorch or any other software library, which are needed for reproducibility.
Experiment Setup: Yes. All reimplemented models are trained for 60 epochs using SGD with momentum and cosine learning rate decay without restart. The initial learning rates are all set to 0.1. The training batch size is 4 for meta-learning models and 256 for non-meta-learning models. The input image size is 84x84 for Conv-4 and ResNet-12 models and 224x224 for other models. We use random crop and horizontal flip as data augmentation during training.

For the experiments in Section 4.1, to make a fair comparison, we train CE and MoCo for 150 epochs and train PN for a number of iterations that makes the number of seen samples equal. SGD with momentum and cosine learning rate decay without restart is used. The backbone is ResNet-18. Learning rates are all set to 0.1. The training batch size is 4 for PN and 256 for CE and MoCo. The input image size is 84x84. During training, we use random crop and horizontal flip as data augmentation for CE and PN; for MoCo, we use the same set of data augmentations as in the MoCo v2 paper (Chen et al., 2020).
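A minimal PyTorch sketch of this optimization setup follows. It is not the authors' code: the momentum coefficient, the ResNet-18 stand-in, and the use of RandomResizedCrop as the "random crop" are assumptions not pinned down by the paper; the training loop body is elided.

```python
import torch
import torchvision
import torchvision.transforms as T

# Reported augmentation: random crop + horizontal flip.
# RandomResizedCrop is an assumed realization of "random crop";
# 84x84 applies to Conv-4/ResNet-12, 224x224 to the other backbones.
train_transform = T.Compose([
    T.RandomResizedCrop(84),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

# Stand-in backbone; the paper reimplements several architectures.
# 64 output classes matches the miniImageNet train split.
model = torchvision.models.resnet18(num_classes=64)

epochs = 60  # 150 for the Section 4.1 CE/MoCo comparison
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,        # initial learning rate reported in the paper
    momentum=0.9,  # assumed: the paper reports SGD+momentum but not the value
)
# Cosine learning-rate decay without restart over the whole run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... one pass over the training set goes here: batch size 256 for
    # non-meta-learning models (CE, MoCo), 4 episodes per batch for
    # meta-learning models (PN). For the Section 4.1 comparison, PN runs
    # for as many iterations as needed so that
    #   iterations * samples_per_episode == 150 * len(train_set),
    # matching the number of samples seen by CE and MoCo.
    scheduler.step()
```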