IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning

Authors: Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our proposed model (i.e., FSL with IEPT) achieves the new state-of-the-art.
Researcher Affiliation | Collaboration | Manli Zhang, Jianhong Zhang & Zhiwu Lu: Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China ({manlizhang,jianhong,luzhiwu}@ruc.edu.cn). Tao Xiang: University of Surrey, Guildford, Surrey, UK (t.xiang@surrey.ac.uk). Mingyu Ding: The University of Hong Kong, Hong Kong (mingyuding@hku.hk). Songfang Huang: Alibaba DAMO Academy, Hangzhou, China (songfang.hsf@alibaba-inc.com).
Pseudocode | Yes | Algorithm 1: FSL with IEPT. Input: the training set Ds, the rotation operator set G, and the loss weight hyperparameters w1, w2, w3. Output: the learned ψ. (A hedged training-step sketch based on this description appears after the table.)
Open Source Code | No | We will release the code soon.
Open Datasets | Yes | Two widely-used FSL datasets are selected: miniImageNet (Vinyals et al., 2016) and tieredImageNet (Ren et al., 2018). Both datasets are subsets sampled from ImageNet (Russakovsky et al., 2015).
Dataset Splits | Yes | The first dataset consists of a total of 100 classes (600 images per class), and the train/validation/test split is set to 64/16/20 classes as in (Ravi & Larochelle, 2017). The second dataset is larger, comprising 608 classes in total (nearly 1,200 images per class), which is split into 351/97/160 classes for train/validation/test. (These splits are recorded in a config snippet after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for experiments.
Software Dependencies | No | PyTorch is used for our implementation. We utilize the Adam optimizer (Kingma & Ba, 2015) for Conv4-64 & Conv4-512 and the SGD optimizer for ResNet-12 to train our IEPT model. The paper mentions PyTorch and the optimizers but does not specify their version numbers or other library versions. (A backbone-dependent optimizer sketch appears after the table.)
Experiment Setup | Yes | We select the hyper-parameters w1, w2 and w3 from the candidate set {0.1, 0.5, 1.0, 5.0, 10.0} and show the hyper-parameter analysis results in Figure 6. We find that the performance of our IEPT is relatively stable. Concretely, the performance of our IEPT is not sensitive to w1 and w2 with proper values, but a too-large w1 (i.e., w1 = 10.0) tends to cause obvious degradation, perhaps because the FSL task is biased by the rotation prediction loss. (A grid-search sketch over this candidate set appears after the table.)
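The Pseudocode row lists only the inputs of Algorithm 1 (training set Ds, rotation operator set G, loss weights w1, w2, w3) and its output (the learned ψ). The following is a minimal, hypothetical PyTorch-style sketch of how such an objective can be assembled, assuming G consists of the four 90-degree rotations and that w1 weights the rotation prediction loss (as suggested by the Experiment Setup row); the pairing of w2 and w3 with the remaining auxiliary terms, and all names below, are illustrative, not the authors' code.

```python
import torch

# Rotation operator set G, assumed here to be multiples of 90 degrees (0/90/180/270).
G = [0, 1, 2, 3]

def rotate_batch(x):
    """Build instance-level pretext inputs from an NCHW batch: every image under every
    rotation in G, paired with the rotation index as the pretext (self-supervised) label."""
    imgs = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in G])
    labels = torch.cat([torch.full((x.size(0),), k, dtype=torch.long) for k in G])
    return imgs, labels

def total_loss(loss_fsl, loss_rot, loss_aux_a, loss_aux_b, w1, w2, w3):
    """Weighted sum of the main FSL loss and the auxiliary losses. Per the paper's
    hyper-parameter analysis, w1 weights the rotation prediction loss; assigning
    w2 and w3 to the other two auxiliary terms is an assumption of this sketch."""
    return loss_fsl + w1 * loss_rot + w2 * loss_aux_a + w3 * loss_aux_b
```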
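The Dataset Splits row gives class-level splits for both benchmarks. The snippet below simply records those quoted numbers in a config dictionary for reference; the structure and names are illustrative, and the per-class image counts are the approximate values stated in the excerpt.

```python
# Class-level train/validation/test splits as quoted in the Dataset Splits row.
DATASET_SPLITS = {
    "miniImageNet":   {"train": 64,  "val": 16, "test": 20,  "total_classes": 100, "images_per_class": 600},
    "tieredImageNet": {"train": 351, "val": 97, "test": 160, "total_classes": 608, "images_per_class": 1200},
}

# Sanity check: the split class counts sum to the total number of classes.
for name, cfg in DATASET_SPLITS.items():
    assert cfg["train"] + cfg["val"] + cfg["test"] == cfg["total_classes"], name
```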
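The Software Dependencies row states that Adam is used for the Conv4-64 and Conv4-512 backbones and SGD for ResNet-12, without reporting versions or optimizer hyperparameters. Below is a small sketch of that backbone-dependent choice; the learning rate and momentum values are placeholders, not values reported in the paper.

```python
import torch

def build_optimizer(backbone_name, params, lr=None):
    """Select the optimizer family reported for each backbone; numeric defaults here
    are placeholder assumptions, since the excerpt does not report them."""
    if backbone_name in ("Conv4-64", "Conv4-512"):
        return torch.optim.Adam(params, lr=lr or 1e-3)                # Adam per the paper
    if backbone_name == "ResNet-12":
        return torch.optim.SGD(params, lr=lr or 0.1, momentum=0.9)    # SGD per the paper
    raise ValueError(f"Unknown backbone: {backbone_name}")
```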
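The Experiment Setup row says w1, w2 and w3 are selected from the candidate set {0.1, 0.5, 1.0, 5.0, 10.0}. The sketch below is a generic exhaustive search over that set; the paper only states that values are selected from the set, so the full grid and the evaluate_on_validation callback (assumed to train and return validation accuracy for given weights) are assumptions of this illustration.

```python
import itertools

CANDIDATES = [0.1, 0.5, 1.0, 5.0, 10.0]

def grid_search(evaluate_on_validation):
    """Try every (w1, w2, w3) combination from the candidate set and keep the
    weights with the highest validation accuracy."""
    best_acc, best_weights = -1.0, None
    for w1, w2, w3 in itertools.product(CANDIDATES, repeat=3):
        acc = evaluate_on_validation(w1, w2, w3)
        if acc > best_acc:
            best_acc, best_weights = acc, (w1, w2, w3)
    return best_weights, best_acc
```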