Margin-Based Few-Shot Class-Incremental Learning with Class-Level Overfitting Mitigation

Authors: Yixiong Zou, Shanghang Zhang, Yuhua Li, Ruixuan Li

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on CIFAR100, Caltech-UCSD Birds-200-2011 (CUB200), and miniImageNet demonstrate that the proposed method effectively mitigates the CO problem and achieves state-of-the-art performance.
Researcher Affiliation | Academia | Yixiong Zou1, Shanghang Zhang2, Yuhua Li1 and Ruixuan Li1. 1School of Computer Science and Technology, Huazhong University of Science and Technology; 2School of Computer Science, Peking University. 1{yixiongz, idcliyuhua, rxli}@hust.edu.cn, 2shanghang@pku.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The implementation is based on CEC's code [28], and our code will be released at https://github.com/Zoilsen/CLOM.
Open Datasets | Yes | Datasets include CIFAR100 [15], Caltech-UCSD Birds-200-2011 (CUB200) [24] and miniImageNet [23], as listed in Tab. 3, following the split in [22].
Dataset Splits | Yes | CIFAR100 contains 100 classes in all. As split by [22], 60 classes are chosen as base classes, and the remaining 40 classes (with 5 training samples in each class) are chosen as novel classes. Table 3 (evaluation datasets):

  Dataset      | Total Classes | Base Classes | Novel Classes | Incremental Sessions | Novel-Class Shot | Input Size
  CIFAR100     | 100           | 60           | 40            | 8                    | 5                | 32x32
  CUB200       | 200           | 100          | 100           | 10                   | 5                | 224x224
  miniImageNet | 100           | 60           | 40            | 8                    | 5                | 84x84
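The split arithmetic in Table 3 can be sanity-checked with a short script. The tuples below are transcribed from the table; deriving the per-session class count as novel classes divided by sessions is an inference from the standard few-shot class-incremental protocol, not a statement quoted from the paper.

```python
# Evaluation-dataset splits transcribed from Table 3:
# name: (total, base, novel, sessions, shot, input_side)
datasets = {
    "CIFAR100":     (100,  60,  40,  8, 5,  32),
    "CUB200":       (200, 100, 100, 10, 5, 224),
    "miniImageNet": (100,  60,  40,  8, 5,  84),
}

for name, (total, base, novel, sessions, shot, side) in datasets.items():
    # Base and novel classes must partition the full class set.
    assert base + novel == total
    # Inferred: each incremental session introduces novel/sessions new classes.
    ways = novel // sessions
    print(f"{name}: {sessions} sessions x {ways}-way {shot}-shot, {side}x{side} inputs")
```

Running this confirms a 5-way 5-shot setting per session for CIFAR100 and miniImageNet, and 10-way 5-shot for CUB200.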
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions that "The implementation is based on CEC's code [28]", but does not provide specific version numbers for software dependencies such as libraries or frameworks.
Experiment Setup | Yes | For CIFAR100, we set d′=256, m_ave=-0.2 and m_upper=-0.5, and we have m′_ave=0.1 and m′_upper=0.2. For CUB200, we scale the learning rate of the backbone network to 10% of the global learning rate since pretraining of the backbone is adopted [30, 28], and set d′ to 8192; then we have m_ave=-0.2, m_upper=-0.25, m′_ave=0.3 and m′_upper=0.6. For miniImageNet, we set d′ to 4096, and have m_ave=-0.2, m_upper=-0.5, m′_ave=0.1 and m′_upper=0.2.