Class Incremental Learning via Likelihood Ratio Based Task Prediction
Authors: Haowei Lin, Yijia Shao, Weinan Qian, Ningxin Pan, Yiduo Guo, Bing Liu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS, Table 1: CIL ACC (%). "-XT": X number of tasks. The best result in each column is highlighted in bold. The baselines are divided into two groups via the dashed line. The first group contains non-replay methods, and the second group contains replay-based methods. Non-CL (non-continual learning) denotes pooling all tasks together to learn all classes as one task, which gives the performance upper bound for CIL. AIA is the average incremental ACC (%). Last is the ACC after learning the final task. See forgetting rate results in Appendix C.2. The pink rows also show the results of Non-CL_PFI and TPL_PFI, which use DeiT pre-trained with the full ImageNet. Table 2: CIL ACC (%) after learning the final task without pre-training (average over the five datasets used in Table 1). The detailed results are shown in Table 8 of Appendix D.2. Figure 2: Ablation studies. Fig. (a) illustrates the ACC gain achieved by each of the designed techniques on the five datasets; Fig. (b) displays the average ACC results obtained with different choices of E_t and E_t^c for Eq. (7); Fig. (c) shows the results for various selections of E_logit for TPL in Eq. (9). A short sketch of the AIA and Last metrics follows the table. |
| Researcher Affiliation | Academia | 1Institute for Artificial Intelligence, Peking University 2Stanford University 3Wangxuan Institute of Computer Technology, Peking University 4Department of Computer Science, University of Illinois at Chicago |
| Pseudocode | Yes | F.3 PSEUDO-CODE: Algorithm 1 Compute TPL Score with the t-th Task-specific Model M(t); Algorithm 2 CIL Training with TPL; Algorithm 3 CIL Testing with TPL. A schematic reconstruction of the score computation follows the table. |
| Open Source Code | Yes | The code of TPL is publicly available at https://github.com/linhaowei1/TPL. |
| Open Datasets | Yes | Datasets. To form a sequence of tasks in CIL experiments, we follow the common CIL setting. We split CIFAR-10 into 5 tasks (2 classes per task) (C10-5T). For CIFAR-100, we conduct two experiments: 10 tasks (10 classes per task) (C100-10T) and 20 tasks (5 classes per task) (C100-20T). For TinyImageNet, we split 200 classes into 5 tasks (40 classes per task) (T-5T) and 10 tasks (20 classes per task) (T-10T). We set the replay buffer size for CIFAR-10 as 200 samples, and for CIFAR-100 and TinyImageNet as 2,000 samples, following Kim et al. (2023). Following the random class order protocol in Rebuffi et al. (2017), we randomly generate five different class orders for each experiment and report the averaged metrics over the 5 random orders. For a fair comparison, the class orderings are kept the same for all systems. Results on a larger dataset are given in Appendix D.1. The task-split protocol is sketched after the table. |
| Dataset Splits | No | The paper mentions training and testing samples (e.g., '50,000 / 10,000 training / testing samples' for CIFAR-10) and the use of a replay buffer, but it does not specify explicit validation dataset splits (e.g., percentages or counts for a distinct validation set) or how data is partitioned into train/val/test beyond just training and testing. |
| Hardware Specification | Yes | We run all the experiments on an NVIDIA GeForce RTX-2080Ti GPU. |
| Software Dependencies | No | Our implementations are based on Ubuntu Linux 16.04 with Python 3.6. We give the comparison in running time in Table 13. We use HAT as the base, as MORE, ROW, and TPL all make use of HAT, and MORE and ROW are the strongest baselines. https://scikit-learn.org/stable/ https://github.com/facebookresearch/faiss |
| Experiment Setup | Yes | To compare with the strongest baselines MORE and ROW, we follow their setup (Kim et al., 2022a; 2023) to set the training epochs as 20, 40, 15, and 10 for CIFAR-10, CIFAR-100, T-5T, and T-10T, respectively. We also follow them in using the SGD optimizer with a momentum of 0.9, a batch size of 64, and a learning rate of 0.005 for C10-5T, T-5T, T-10T, and C100-20T, and 0.001 for C100-10T. The only hyper-parameters used in our method TPL are γ, the temperature parameter for task-id prediction, and k, the hyper-parameter of d_KNN(x, Buf) in Equation (7). The values of γ and k are searched from {0.01, 0.05, 0.10, 0.50, 1.0, 2.0, 5.0, 10.0} and {1, 2, 5, 10, 50, 100}, respectively. We choose γ = 0.05 and k = 5 for all the experiments as they achieve the overall best results. The γ/k grid search is sketched after the table. |
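The AIA and Last metrics reported in Table 1 are simple to state in code. Below is a minimal sketch, assuming `acc_after_task[t]` holds the test accuracy over all classes seen so far, measured after learning task t (the variable and function names are hypothetical, not from the authors' code):

```python
def aia_and_last(acc_after_task):
    """acc_after_task: list of accuracies (%), one measured after each learned task."""
    aia = sum(acc_after_task) / len(acc_after_task)  # average incremental ACC
    last = acc_after_task[-1]                        # ACC after learning the final task
    return aia, last

# Example for a 5-task run (numbers are illustrative, not from the paper).
aia, last = aia_and_last([95.1, 91.3, 88.7, 86.2, 84.5])
```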
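The pseudocode row points to Algorithms 1-3 in Appendix F.3, which compute a TPL score per task-specific model M(t) and use it for CIL training and testing. The sketch below reconstructs only the ingredients named in this report: temperature-scaled logits (γ = 0.05) and the k-nearest-neighbor distance d_KNN(x, Buf) to the replay buffer (k = 5) from Equation (7). How the two terms are actually combined into a likelihood ratio is the paper's contribution; the simple difference used here is an illustrative assumption.

```python
import numpy as np

def knn_distance(feature, buffer_features, k=5):
    """d_KNN(x, Buf): mean Euclidean distance from x to its k nearest
    neighbors among the replay-buffer features (k = 5 in the paper)."""
    d = np.linalg.norm(buffer_features - feature, axis=1)
    return np.sort(d)[:k].mean()

def task_score(logits_t, feature, buffer_features_t, gamma=0.05, k=5):
    """Schematic task-id score for the t-th task-specific model M(t):
    larger = input more likely to belong to task t. Combines a
    temperature-scaled energy over the logits (gamma = 0.05 in the paper)
    with d_KNN; taking their difference is an illustrative assumption,
    not the paper's exact likelihood ratio of Eqs. (7)-(9)."""
    z = logits_t / gamma
    energy = gamma * (z.max() + np.log(np.exp(z - z.max()).sum()))  # stable logsumexp
    return energy - knn_distance(feature, buffer_features_t, k)
```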
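The datasets row describes the random class order protocol of Rebuffi et al. (2017): five random class orders per experiment, with classes split evenly across tasks. A minimal sketch, assuming a hypothetical `build_task_sequence` helper and seeds 0-4:

```python
import numpy as np

def build_task_sequence(num_classes, num_tasks, seed):
    """Split class labels into equal-sized tasks under a random class order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_classes)   # random class order
    per_task = num_classes // num_tasks    # e.g., 10 classes per task for C100-10T
    return [sorted(order[i * per_task:(i + 1) * per_task]) for i in range(num_tasks)]

# C100-10T: CIFAR-100 split into 10 tasks, averaged over 5 random class orders.
for seed in range(5):
    tasks = build_task_sequence(num_classes=100, num_tasks=10, seed=seed)
```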
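Finally, the experiment setup row describes the γ/k selection as a grid search over fixed candidate sets. A minimal sketch, where `evaluate_cil(gamma, k)` is a hypothetical stand-in for a full training-plus-evaluation run:

```python
from itertools import product

GAMMAS = [0.01, 0.05, 0.10, 0.50, 1.0, 2.0, 5.0, 10.0]  # search space from the paper
KS = [1, 2, 5, 10, 50, 100]

def grid_search(evaluate_cil):
    """Return the (gamma, k) pair with the best CIL accuracy."""
    return max(product(GAMMAS, KS), key=lambda gk: evaluate_cil(*gk))

# The paper reports gamma = 0.05 and k = 5 as the overall best choice.
```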