A Closer Look at Few-shot Classification

Authors: Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we present 1) a consistent comparative analysis of several representative few-shot classification algorithms, with results showing that deeper backbones significantly reduce the performance differences among methods on datasets with limited domain differences, 2) a modified baseline method that surprisingly achieves competitive performance when compared with the state-of-the-art on both the mini-ImageNet and the CUB datasets, and 3) a new experimental setting for evaluating the cross-domain generalization ability for few-shot classification algorithms. Our results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones. In a realistic cross-domain evaluation setting, we show that a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms. (A minimal sketch of the distance-based classifier behind the modified baseline appears after the table.)
Researcher Affiliation | Academia | Wei-Yu Chen, Carnegie Mellon University (weiyuc@andrew.cmu.edu); Yen-Cheng Liu & Zsolt Kira, Georgia Tech ({ycliu,zkira}@gatech.edu); Yu-Chiang Frank Wang, National Taiwan University (ycwang@ntu.edu.tw); Jia-Bin Huang, Virginia Tech (jbhuang@vt.edu)
Pseudocode | No | The paper describes methods in text and uses conceptual diagrams (Figures 1 and 2), but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Through making the source code and model implementations with a consistent evaluation setting publicly available, we hope to foster future progress in the field. https://github.com/wyharveychen/CloserLookFewShot
Open Datasets | Yes | For object recognition, we use the mini-ImageNet dataset commonly used in evaluating few-shot classification algorithms. The mini-ImageNet dataset consists of a subset of 100 classes from the ImageNet dataset (Deng et al., 2009) and contains 600 images for each class. The dataset was first proposed by Vinyals et al. (2016), but recent works use the follow-up setting provided by Ravi & Larochelle (2017), which is composed of randomly selected 64 base, 16 validation, and 20 novel classes. For fine-grained classification, we use the CUB-200-2011 dataset (Wah et al., 2011) (referred to as the CUB hereafter). The CUB dataset contains 200 classes and 11,788 images in total. Following the evaluation protocol of Hilliard et al. (2018), we randomly split the dataset into 100 base, 50 validation, and 50 novel classes.
Dataset Splits | Yes | The mini-ImageNet dataset ... composed of randomly selected 64 base, 16 validation, and 20 novel classes. For fine-grained classification, we use the CUB-200-2011 dataset ... we randomly split the dataset into 100 base, 50 validation, and 50 novel classes. For the cross-domain scenario (mini-ImageNet → CUB), we use mini-ImageNet as our base classes and the 50 validation and 50 novel classes from CUB. (A small class-split sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify version numbers for any software dependencies, libraries, or programming languages.
Experiment Setup | Yes | In the training stage for the Baseline and the Baseline++ methods, we train 400 epochs with a batch size of 16. In the meta-training stage for meta-learning methods, we train 60,000 episodes for 1-shot and 40,000 episodes for 5-shot tasks. ... In each episode, we sample N classes to form N-way classification (N is 5 in both meta-training and meta-testing stages unless otherwise mentioned). For each class, we pick k labeled instances as our support set and 16 instances for the query set for a k-shot task. ... All methods are trained from scratch and use the Adam optimizer with initial learning rate 10^-3. We apply standard data augmentation including random crop, left-right flip, and color jitter in both the training and meta-training stages. (An episode-sampling sketch appears after the table.)
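To make the modified baseline concrete: Baseline++ replaces the final linear classifier with a distance-based (cosine-similarity) classifier, which is the mechanism the paper credits with reducing intra-class variation. The following is a minimal sketch of that idea, assuming PyTorch; the scale factor, feature dimension, and class count are illustrative assumptions rather than values taken from the paper or its repository.

```python
# Hypothetical sketch of a cosine-similarity ("distance-based") classifier in the
# spirit of Baseline++: each class has a learned weight vector acting as a prototype,
# and logits are scaled cosine similarities instead of plain dot products.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int, scale: float = 10.0):
        super().__init__()
        # One learnable prototype vector per class (randomly initialized here).
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale  # temperature on the cosine logits (assumed value)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Normalize both features and class vectors, then take dot products,
        # i.e. cosine similarities, and scale them so the softmax is not too flat.
        features = F.normalize(features, dim=-1)
        weight = F.normalize(self.weight, dim=-1)
        return self.scale * features @ weight.t()

# Usage idea: swap this in for the final linear layer on top of a backbone, e.g.
# logits = CosineClassifier(feat_dim=512, num_classes=64)(backbone(images))
```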
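For the dataset splits, the CUB protocol randomly partitions the 200 classes into 100 base, 50 validation, and 50 novel classes. A small Python sketch of such a random class split is shown below; the seed and the class-ID representation are assumptions for illustration, and the official repository ships its own split files.

```python
# Minimal sketch of a random base/validation/novel class split
# (e.g. CUB: 200 classes into 100 / 50 / 50).
import random

def split_classes(class_ids, n_base=100, n_val=50, n_novel=50, seed=0):
    assert len(class_ids) == n_base + n_val + n_novel
    rng = random.Random(seed)       # seed is an assumed, illustrative choice
    shuffled = class_ids[:]
    rng.shuffle(shuffled)
    base = shuffled[:n_base]
    val = shuffled[n_base:n_base + n_val]
    novel = shuffled[n_base + n_val:]
    return base, val, novel

# e.g. base, val, novel = split_classes(list(range(200)))
```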
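The episodic setup samples N = 5 classes per episode, with k support and 16 query images per class. The sketch below illustrates one such episode; the data structure (a mapping from class to image indices) is a hypothetical choice for illustration, not the paper's actual data loader.

```python
# Sketch of sampling one N-way k-shot episode as described in the experiment setup.
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=16, rng=random):
    # Pick the episode's classes, then split each class's sampled images
    # into support (k per class) and query (16 per class) sets.
    classes = rng.sample(list(images_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        picks = rng.sample(images_by_class[cls], k_shot + n_query)
        support += [(idx, label) for idx in picks[:k_shot]]
        query += [(idx, label) for idx in picks[k_shot:]]
    return support, query

# Per the quoted setup, meta-training runs 60,000 such episodes for 1-shot and
# 40,000 for 5-shot tasks, optimizing with Adam at an initial learning rate of 1e-3.
```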