M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning
Authors: Jinhao Lin, Ziheng Wu, Weifeng Lin, Jun Huang, Ronghua Luo
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our approach achieves superior performance over previous state-of-the-art methods. |
| Researcher Affiliation | Collaboration | Jinhao Lin1,2, Ziheng Wu2, Weifeng Lin1,2, Jun Huang2, Ronghua Luo1*; 1 South China University of Technology, 2 Alibaba Group |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | The paper does not contain any explicit statements about making the source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our method's performance on three mainstream benchmark datasets: Caltech-UCSD Birds-200-2011 (CUB200) (Wah et al. 2011), CIFAR100 (Krizhevsky, Hinton et al. 2009), and miniImageNet (Russakovsky et al. 2015). |
| Dataset Splits | Yes | Dataset: Following the guidance of CEC, we divide each dataset into base and incremental sessions. For CUB200, the base session comprises 100 classes, and the remaining 100 classes are divided into 10 incremental sessions, each consisting of 10 classes with 5 instances per class. For CIFAR100 and miniImageNet, the base session comprises 60 classes, and the remaining 40 classes are divided into 8 incremental sessions, each consisting of 5 classes with 5 instances per class. (A hedged split sketch follows the table.) |
| Hardware Specification | Yes | We employ stochastic gradient descent (SGD) with momentum for optimization, with a learning rate of 1e-3 and a batch size of 256 on 4x A100 GPUs. |
| Software Dependencies | No | Our implementation is based on PyTorch, and the choice of backbone model is determined according to TOPIC (Tao et al. 2020): ResNet-20 (He et al. 2016) for CIFAR100, ResNet-18 for CUB200 and miniImageNet. The same general data augmentation methods as other methods are used. (PyTorch is named as the framework, but no version numbers or dependency list are given.) |
| Experiment Setup | Yes | Training Details: Our implementation is based on PyTorch, and the choice of backbone model is determined according to TOPIC (Tao et al. 2020): ResNet-20 (He et al. 2016) for CIFAR100, ResNet-18 for CUB200 and miniImageNet. The same general data augmentation methods as other methods are used. We employ stochastic gradient descent (SGD) with momentum for optimization, with a learning rate of 1e-3 and a batch size of 256 on 4x A100 GPUs. (A hedged optimizer sketch follows the table.) |
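
To make the CEC-style split in the Dataset Splits row concrete, below is a minimal sketch of how such a base-plus-incremental partition could be assembled for CIFAR100 (60 base classes, then eight 5-way 5-shot sessions). The function name `build_fscil_sessions`, the sorted class ordering, and the random per-class sampling are illustrative assumptions; the paper does not specify these details.

```python
import random
from collections import defaultdict

def build_fscil_sessions(labels, num_base=60, way=5, shot=5, num_inc=8, seed=0):
    """Partition a labeled training set into FSCIL sessions.

    labels: integer class label for each training example.
    Returns one dict per session with its class list and example indices.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    classes = sorted(by_class)
    assert len(classes) >= num_base + way * num_inc

    # Base session: all training data of the first `num_base` classes.
    base_classes = classes[:num_base]
    sessions = [{
        "classes": base_classes,
        "indices": [i for c in base_classes for i in by_class[c]],
    }]

    # Each incremental session adds `way` new classes with `shot` examples each.
    for s in range(num_inc):
        start = num_base + s * way
        new_classes = classes[start:start + way]
        sessions.append({
            "classes": new_classes,
            "indices": [i for c in new_classes for i in rng.sample(by_class[c], shot)],
        })
    return sessions

# Toy usage with a CIFAR100-like label layout (100 classes, 20 examples each):
labels = [c for c in range(100) for _ in range(20)]
sessions = build_fscil_sessions(labels)
print(len(sessions))                # 9: one base session + 8 incremental sessions
print(len(sessions[1]["indices"]))  # 25: 5 classes x 5 instances per class
```

With the CUB200 defaults (`num_base=100, way=10, shot=5, num_inc=10`) the same function yields the 100-class base session and ten 10-way 5-shot increments described in the row above.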
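Similarly, the quoted training details (SGD with momentum, learning rate 1e-3, batch size 256, ResNet-18 backbone) translate into the following minimal PyTorch sketch. The momentum value, the loss function, and the dummy tensors are assumptions; the excerpt states none of them.

```python
import torch
from torchvision.models import resnet18

# Dummy tensors standing in for the real training set (assumption).
train_set = torch.utils.data.TensorDataset(
    torch.randn(512, 3, 224, 224),  # images
    torch.randint(0, 100, (512,)),  # class labels
)
loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

model = resnet18(num_classes=100)  # ResNet-18 backbone (CUB200 / miniImageNet)
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-3,       # learning rate from the quoted training details
    momentum=0.9,  # assumed value: "momentum" is stated, the number is not
)
criterion = torch.nn.CrossEntropyLoss()  # assumed base-session objective

model.train()
for images, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
```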