M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning

Authors: Jinhao Lin, Ziheng Wu, Weifeng Lin, Jun Huang, Ronghua Luo

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our approach achieves superior performance over previous state-of-the-art methods.
Researcher Affiliation | Collaboration | Jinhao Lin (1,2), Ziheng Wu (2), Weifeng Lin (1,2), Jun Huang (2), Ronghua Luo (1*); 1: South China University of Technology, 2: Alibaba Group
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper.
Open Source Code | No | The paper does not contain any explicit statement about making the source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | We evaluate our method's performance on three mainstream benchmark datasets: Caltech-UCSD Birds-200-2011 (CUB200) (Wah et al. 2011), CIFAR100 (Krizhevsky, Hinton et al. 2009), and miniImageNet (Russakovsky et al. 2015).
Dataset Splits | Yes | Dataset: Following the guidance of CEC, we divide each dataset into base and incremental sessions. For CUB200, the base session comprises 100 classes, and the remaining 100 classes are divided into 10 incremental sessions; each incremental session consists of 10 classes, with 5 instances per class. For CIFAR100 and miniImageNet, the base session comprises 60 classes, and the remaining 40 classes are divided into 8 incremental sessions; each incremental session consists of 5 classes, with 5 instances per class. (A configuration sketch of these splits follows the table.)
Hardware Specification | Yes | We employ stochastic gradient descent (SGD) with momentum for optimization, with a learning rate of 1e-3 and a batch size of 256, on 4x A100 GPUs.
Software Dependencies | No | Our implementation is based on PyTorch, and the choice of backbone model follows TOPIC (Tao et al. 2020): ResNet-20 (He et al. 2016) for CIFAR100, ResNet-18 for CUB200 and miniImageNet. The same general data augmentation methods as other methods are used. (No version numbers for these dependencies are stated.)
Experiment Setup | Yes | Training Details: Our implementation is based on PyTorch, and the choice of backbone model follows TOPIC (Tao et al. 2020): ResNet-20 (He et al. 2016) for CIFAR100, ResNet-18 for CUB200 and miniImageNet. The same general data augmentation methods as other methods are used. We employ stochastic gradient descent (SGD) with momentum for optimization, with a learning rate of 1e-3 and a batch size of 256, on 4x A100 GPUs. (A minimal training-setup sketch follows the table.)
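
For quick reference, the split quoted in the Dataset Splits row can be written as a small configuration dictionary. The sketch below is ours, not the paper's (the names SESSION_SPLITS, ways, and shots are hypothetical); only the class and session counts come from the quoted text.

```python
# Hypothetical sketch of the CEC-style session splits quoted above.
# Keys and structure are our own; only the numbers come from the paper.
SESSION_SPLITS = {
    "CUB200":       {"base_classes": 100, "inc_sessions": 10, "ways": 10, "shots": 5},
    "CIFAR100":     {"base_classes": 60,  "inc_sessions": 8,  "ways": 5,  "shots": 5},
    "miniImageNet": {"base_classes": 60,  "inc_sessions": 8,  "ways": 5,  "shots": 5},
}

def total_classes(cfg):
    """Base classes plus all classes introduced across the incremental sessions."""
    return cfg["base_classes"] + cfg["inc_sessions"] * cfg["ways"]

for name, cfg in SESSION_SPLITS.items():
    print(name, total_classes(cfg))  # CUB200: 200, CIFAR100: 100, miniImageNet: 100
```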
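The training details quoted in the Software Dependencies and Experiment Setup rows (PyTorch, ResNet-18 or ResNet-20 backbone, SGD with momentum, learning rate 1e-3, batch size 256) correspond to a standard optimizer configuration. The snippet below is a minimal sketch under those assumptions; the momentum value (0.9), the torchvision ResNet-18 constructor, and the placeholder data are our choices and are not specified in the quoted text.

```python
import torch
from torchvision.models import resnet18  # ResNet-18 is used for CUB200 / miniImageNet;
                                          # the paper uses ResNet-20 for CIFAR100 (not in torchvision)

# Hyperparameters stated in the paper's training details.
LEARNING_RATE = 1e-3
BATCH_SIZE = 256  # stated batch size; a small placeholder batch is used below for illustration

# Assumption: classifier head sized to the 100 base classes of CUB200.
model = resnet18(num_classes=100)

# SGD with momentum, as stated; the momentum value 0.9 is a common default, not quoted in the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

# One illustrative training step with placeholder data (real training uses the augmented datasets).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 100, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```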