Prediction Error-based Classification for Class-Incremental Learning

Authors: Michał Zając, Tinne Tuytelaars, Gido M. van de Ven

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an experimental evaluation of PEC on several class-incremental learning benchmarks.
Researcher Affiliation | Academia | Michał Zając (KU Leuven, ESAT-PSI; Jagiellonian University); Tinne Tuytelaars (KU Leuven, ESAT-PSI); Gido M. van de Ven (KU Leuven, ESAT-PSI)
Pseudocode | Yes | Algorithm 1: Training of PEC; Algorithm 2: Inference using PEC (see the training/inference sketch after this table)
Open Source Code | Yes | We release source code for our experiments: https://github.com/michalzajac-ml/pec
Open Datasets | Yes | We conduct comprehensive experiments on CIL sequences utilizing the MNIST, SVHN, CIFAR-10, CIFAR-100, and miniImageNet datasets. For MNIST (Deng, 2012), Balanced SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky et al., 2009), we use (5/2) and (10/1) splits. For CIFAR-100 (Krizhevsky et al., 2009), we use (10/10) and (100/1) splits. For miniImageNet (Vinyals et al., 2016), we use (20/5) and (100/1) splits. (A sketch of constructing such class-incremental splits follows the table.)
Dataset Splits | Yes | To select hyperparameters for our benchmarking experiments presented in Tables 1 and 2, for every combination of method, dataset, and task split, we perform a separate grid search. For every considered set of hyperparameters, we perform one experiment with a single random seed. We then select the hyperparameter values that yield the highest final validation set accuracy, and we use those values for our benchmarking experiments with ten different random seeds. (A grid-search sketch follows the table.)
Hardware Specification | Yes | For running experiments, we used Slurm-based clusters with heterogeneous hardware, including nodes with A100 GPUs, V100 GPUs, and CPUs only.
Software Dependencies | No | The paper mentions software like PyTorch and Mammoth but does not provide specific version numbers for these software components.
Experiment Setup | Yes | All experiments follow the class-incremental learning scenario, and, unless otherwise noted, for each benchmark we only allow a single pass through the data, i.e. each training sample is seen only once except if it is stored in the replay buffer. Performance is evaluated as the final test accuracy after training on all classes. We adopt Adam (Kingma & Ba, 2014) as the optimizer. We provide additional details about the experimental setup in Appendix B. For example, Appendix B.3 lists the grids: learning rate (lr) {0.00003, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03}, batch size (bs) {1, 10, 32}.
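
To make Algorithms 1 and 2 concrete, the following is a minimal PyTorch sketch of PEC training and inference as described in the paper: each class gets a small student network trained to replicate the outputs of a frozen, randomly initialized teacher on that class's data only, and a test input is assigned to the class whose student has the smallest prediction error. The network sizes, the shared teacher across classes, the MSE error measure, and the single-pass loop are assumptions made for illustration, not the authors' exact implementation (which is available at the repository linked above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def small_mlp(in_dim, hidden, out_dim):
    # Small MLP used for both the frozen teacher and the per-class students.
    # The exact architecture is an assumption made for this sketch.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))


class PECSketch:
    """Sketch of PEC training (Algorithm 1) and inference (Algorithm 2)."""

    def __init__(self, num_classes, in_dim=784, hidden=64, out_dim=32, lr=1e-3):
        # Frozen, randomly initialized teacher; assumed shared across classes here.
        self.teacher = small_mlp(in_dim, hidden, out_dim)
        for p in self.teacher.parameters():
            p.requires_grad_(False)
        # One small student network per class.
        self.students = [small_mlp(in_dim, hidden, out_dim) for _ in range(num_classes)]
        self.lr = lr

    def train_class(self, c, loader):
        # Algorithm 1 (sketch): train student c to replicate the teacher on
        # data from class c only, in a single pass over that data.
        student = self.students[c]
        opt = torch.optim.Adam(student.parameters(), lr=self.lr)
        for x, _ in loader:  # loader is assumed to yield only class-c examples
            x = x.view(x.size(0), -1)
            loss = F.mse_loss(student(x), self.teacher(x))
            opt.zero_grad()
            loss.backward()
            opt.step()

    @torch.no_grad()
    def predict(self, x):
        # Algorithm 2 (sketch): classify by the smallest prediction error.
        x = x.view(x.size(0), -1)
        target = self.teacher(x)
        errors = torch.stack(
            [((s(x) - target) ** 2).mean(dim=1) for s in self.students], dim=1
        )
        return errors.argmin(dim=1)
```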
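
The (classes/tasks) splits listed under Open Datasets, e.g. a (5/2) split of CIFAR-10 into 5 tasks of 2 classes each, can be built as in the sketch below. It assumes torchvision datasets and omits the normalization and other transforms used in the actual experiments; the root path and batch size are illustrative.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms


def class_incremental_tasks(dataset, classes_per_task):
    # Group consecutive class labels into tasks, e.g. a (5/2) split of
    # CIFAR-10 yields 5 tasks of 2 classes each.
    targets = torch.as_tensor(dataset.targets)
    num_classes = int(targets.max()) + 1
    tasks = []
    for start in range(0, num_classes, classes_per_task):
        task_classes = torch.arange(start, start + classes_per_task)
        idx = torch.isin(targets, task_classes).nonzero(as_tuple=True)[0]
        tasks.append(Subset(dataset, idx.tolist()))
    return tasks


# Example: CIFAR-10 with a (5/2) split, one DataLoader per task.
cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                           transform=transforms.ToTensor())
tasks = class_incremental_tasks(cifar10, classes_per_task=2)
loaders = [DataLoader(t, batch_size=32, shuffle=True) for t in tasks]
```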
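
The hyperparameter selection protocol (a separate grid search per configuration with a single seed, after which the selected values are re-used for benchmarking with ten seeds) could look roughly like the sketch below. The `train_and_validate` helper is a hypothetical stand-in, not part of the paper's code; the grid values are those quoted from Appendix B.3.

```python
import itertools


def train_and_validate(lr, batch_size, seed=0):
    # Hypothetical stand-in: in the real protocol this would run one training
    # experiment with the given hyperparameters and return the final
    # validation-set accuracy. Here it only returns a dummy score.
    return 0.0


# Grids quoted from Appendix B.3.
learning_rates = [0.00003, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03]
batch_sizes = [1, 10, 32]

# One experiment per configuration, with a single random seed.
results = {
    (lr, bs): train_and_validate(lr, bs, seed=0)
    for lr, bs in itertools.product(learning_rates, batch_sizes)
}
best_lr, best_bs = max(results, key=results.get)
# The selected values would then be re-used for the benchmarking runs
# with ten different random seeds.
```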