IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning
Authors: Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed model (i.e., FSL with IEPT) achieves the new state-of-the-art. |
| Researcher Affiliation | Collaboration | Manli Zhang, Jianhong Zhang & Zhiwu Lu: Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; {manlizhang,jianhong,luzhiwu}@ruc.edu.cn. Tao Xiang: University of Surrey, Guildford, Surrey, UK; t.xiang@surrey.ac.uk. Mingyu Ding: The University of Hong Kong, Hong Kong; mingyuding@hku.hk. Songfang Huang: Alibaba DAMO Academy, Hangzhou, China; songfang.hsf@alibaba-inc.com |
| Pseudocode | Yes | Algorithm 1: FSL with IEPT. Input: the training set Ds, the rotation operator set G, and the loss weight hyperparameters w1, w2, w3. Output: the learned ψ. (See the loss-weighting sketch after the table.) |
| Open Source Code | No | We will release the code soon. |
| Open Datasets | Yes | Two widely-used FSL datasets are selected: miniImageNet (Vinyals et al., 2016) and tieredImageNet (Ren et al., 2018). Both datasets are subsets sampled from ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | The first dataset consists of a total of 100 classes (600 images per class), and the train/validation/test split is set to 64/16/20 classes as in (Ravi & Larochelle, 2017). The second dataset is a larger dataset with 608 classes in total (nearly 1,200 images per class), which is split into 351/97/160 classes for train/validation/test. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for experiments. |
| Software Dependencies | No | PyTorch is used for our implementation. We utilize the Adam optimizer (Kingma & Ba, 2015) for Conv4-64 & Conv4-512 and the SGD optimizer for ResNet-12 to train our IEPT model. The paper mentions PyTorch and the optimizers but does not specify their version numbers or other library versions. (See the optimizer sketch after the table.) |
| Experiment Setup | Yes | We select the hyper-parameters w1, w2 and w3 from the candidate set {0.1, 0.5, 1.0, 5.0, 10.0} and show the hyper-parameter analysis results in Figure 6. We find that the performance of our IEPT is relatively stable. Concretely, the performance of our IEPT is not sensitive to w1 and w2 with proper values, but too large a w1 (i.e., w1 = 10.0) tends to cause obvious degradation, perhaps because the FSL task is biased by the rotation prediction loss. (See the grid sketch after the table.) |
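
As referenced in the Pseudocode row, Algorithm 1 takes the loss weights w1, w2, w3 as input. The sketch below shows one way such a weighted multi-task objective could be assembled; the pairing of w2 and w3 with the instance-level and episode-level terms, and all function names, are assumptions for illustration, since the quoted hyper-parameter analysis only ties w1 to the rotation prediction loss.

```python
import torch


def iept_total_loss(loss_fsl: torch.Tensor,
                    loss_rotation: torch.Tensor,
                    loss_instance: torch.Tensor,
                    loss_episode: torch.Tensor,
                    w1: float, w2: float, w3: float) -> torch.Tensor:
    """Weighted sum of the FSL loss and the pretext-task losses.

    Which weight multiplies which auxiliary term is an assumption here,
    except that w1 scales the rotation prediction loss, as implied by the
    hyper-parameter analysis quoted in the Experiment Setup row.
    """
    return loss_fsl + w1 * loss_rotation + w2 * loss_instance + w3 * loss_episode
```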
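The Software Dependencies row notes that Adam is used for the Conv4-64 and Conv4-512 backbones and SGD for ResNet-12, without versions or learning rates. Below is a hedged sketch of that per-backbone optimizer choice; the learning rates and momentum are placeholders, not values reported in the paper.

```python
import torch


def build_optimizer(model: torch.nn.Module, backbone: str) -> torch.optim.Optimizer:
    # Adam for the Conv4 backbones, SGD for ResNet-12, as stated in the paper.
    # Learning rates and momentum below are placeholders, not reported values.
    if backbone in ("Conv4-64", "Conv4-512"):
        return torch.optim.Adam(model.parameters(), lr=1e-3)
    if backbone == "ResNet-12":
        return torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    raise ValueError(f"unsupported backbone: {backbone}")
```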
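The Experiment Setup row states that w1, w2 and w3 are selected from the candidate set {0.1, 0.5, 1.0, 5.0, 10.0}. The sketch below simply enumerates that grid; whether the authors searched all 125 combinations or tuned each weight separately is not stated, so exhaustive enumeration is an assumption.

```python
from itertools import product

CANDIDATE_WEIGHTS = (0.1, 0.5, 1.0, 5.0, 10.0)


def loss_weight_grid():
    # All (w1, w2, w3) combinations from the candidate set: 5 ** 3 = 125.
    # Exhaustive enumeration is an assumption; the paper only gives the set.
    return list(product(CANDIDATE_WEIGHTS, repeat=3))


if __name__ == "__main__":
    grid = loss_weight_grid()
    print(len(grid))   # 125
    print(grid[:2])    # [(0.1, 0.1, 0.1), (0.1, 0.1, 0.5)]
```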