Few-Shot Class-Incremental Learning via Relation Knowledge Distillation

Authors: Songlin Dong, Xiaopeng Hong, Xiaoyu Tao, Xinyuan Chang, Xing Wei, Yihong Gong (pp. 1255-1263)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A large number of experiments demonstrate that relation knowledge does exist in the exemplars and our approach outperforms other state-of-the-art class-incremental learning methods on the CIFAR100, miniImageNet, and CUB200 datasets.
Researcher Affiliation | Academia | Songlin Dong (1), Xiaopeng Hong (2), Xiaoyu Tao (1), Xinyuan Chang (3), Xing Wei (3), Yihong Gong (3). (1) College of Artificial Intelligence, Xi'an Jiaotong University; (2) School of Cyber Science and Engineering, Xi'an Jiaotong University; (3) School of Software Engineering, Xi'an Jiaotong University. {dsl972731417,txy666793,cxy19960919}@stu.xjtu.edu.cn; {hongxiaopeng,weixing,ygong}@mail.xjtu.edu.cn
Pseudocode | No | The paper describes the methods in prose and mathematical equations but does not include a structured pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code.
Open Datasets | Yes | We conduct experiments on three image classification datasets: CIFAR100 (Krizhevsky and Hinton 2009), miniImageNet (Vinyals et al. 2016a), and CUB200 (Wah et al. 2011).
Dataset Splits | Yes | For the CIFAR100 and miniImageNet datasets, 60 classes are used as base classes and the remaining 40 classes are equally divided among the incremental learning tasks. We adopt the 5-way 5-shot setting, so there are 9 training tasks in total. For the CUB200 dataset, we adopt the 10-way 5-shot setting, taking 100 classes as base classes and dividing the remaining classes into 10 new learning tasks. For all datasets, we randomly pick 5 samples per class from the original dataset for the incremental training sets, while the testing set remains the original one, which is large enough to evaluate generalization performance and prevent over-fitting. (A split-construction sketch follows the table.)
Hardware Specification | No | The paper mentions using PyTorch and ResNet18/ResNet20 as the backbone network, but does not specify the hardware (e.g., CPU or GPU model) used for training or inference.
Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | Training details: All our models are implemented in PyTorch and use ResNet18 or ResNet20 as the backbone network. For CIFAR100 and miniImageNet, we train the base model for 160 epochs using mini-batch SGD with a mini-batch size of 128. The learning rate is initialized to 0.1 and decreased to 0.01 and 0.001 at epochs 80 and 120, respectively. For the CUB200 dataset, we use a pre-trained ResNet18 and train the base model with an initial learning rate of 0.05, reduce the learning rate to 0.005 after 15 epochs, and stop training at epoch 20. For each new task, we fine-tune the model with a learning rate of 1e-4 for 50 epochs on all three datasets. In the FSCIL setting, since each new task contains very few training samples, we use them all to construct mini-batches for incremental learning. For data augmentation, we perform standard random cropping and flipping as in (He et al. 2015; Hou et al. 2019) for all methods and add ColorJitter on miniImageNet. (A training-schedule sketch follows the table.)
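
To make the quoted split protocol concrete, here is a minimal sketch of how the CIFAR100 FSCIL splits (60 base classes, then 8 incremental 5-way 5-shot sessions, 9 training tasks in total) could be built. The paper's code is not released, so `build_fscil_splits` is a hypothetical helper; the class ordering and random seed are assumptions.

```python
import random
from collections import defaultdict

from torchvision.datasets import CIFAR100

def build_fscil_splits(targets, num_base=60, way=5, shot=5, seed=0):
    """Return per-session class lists and training-sample indices (hypothetical helper)."""
    rng = random.Random(seed)
    classes = list(range(100))
    # Session 0 = 60 base classes; sessions 1..8 = 5 new classes each.
    sessions = [classes[:num_base]] + [classes[num_base + i:num_base + i + way]
                                       for i in range(0, 100 - num_base, way)]

    # Group training-sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(targets):
        by_class[label].append(idx)

    tasks = []
    for s, class_list in enumerate(sessions):
        indices = []
        for c in class_list:
            # Full data for the base session, 5 random shots per class afterwards.
            indices.extend(by_class[c] if s == 0 else rng.sample(by_class[c], shot))
        tasks.append(indices)
    return sessions, tasks

train_set = CIFAR100(root="./data", train=True, download=True)
sessions, tasks = build_fscil_splits(train_set.targets)
print(len(sessions))  # 9 training tasks (1 base + 8 incremental)
```

The test set is left untouched: evaluation after each session uses the full original test split over all classes seen so far, as described in the quoted excerpt.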
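
The base-session training schedule quoted above can also be written down as a short PyTorch sketch. The learning-rate schedule, batch size, epoch counts, and augmentations follow the quoted details; the backbone choice (`resnet18`), momentum, weight decay, and the 32x32 crop size are assumptions not stated in the excerpt.

```python
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import transforms
from torchvision.models import resnet18

# Base session for CIFAR100 / miniImageNet as quoted above:
# SGD, batch size 128, lr 0.1 -> 0.01 -> 0.001 at epochs 80 and 120, 160 epochs.
model = resnet18(num_classes=60)  # 60 base classes; momentum/weight decay are assumed
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)
criterion = nn.CrossEntropyLoss()

# Standard random crop + flip; ColorJitter is added for miniImageNet only.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # crop size is an assumption
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4),  # miniImageNet only
    transforms.ToTensor(),
])

def train_base(model, loader, epochs=160):
    model.train()
    for _ in range(epochs):
        for images, labels in loader:        # mini-batches of size 128
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()

# Each new few-shot task is then fine-tuned for 50 epochs with lr 1e-4,
# e.g. by rebuilding the optimizer as SGD(model.parameters(), lr=1e-4).
```

For CUB200 the same loop would start from a pre-trained ResNet18 with an initial learning rate of 0.05, dropped to 0.005 after 15 epochs and stopped at epoch 20, per the quoted setup.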