Prediction Error-based Classification for Class-Incremental Learning

Authors: Michał Zając, Tinne Tuytelaars, Gido M. van de Ven

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an experimental evaluation of PEC on several class-incremental learning benchmarks.
Researcher Affiliation | Academia | Michał Zając (KU Leuven, ESAT-PSI; Jagiellonian University); Tinne Tuytelaars (KU Leuven, ESAT-PSI); Gido M. van de Ven (KU Leuven, ESAT-PSI)
Pseudocode | Yes | Algorithm 1: Training of PEC; Algorithm 2: Inference using PEC (see the training/inference sketch after this table)
Open Source Code | Yes | We release source code for our experiments: https://github.com/michalzajac-ml/pec
Open Datasets | Yes | We conduct comprehensive experiments on CIL sequences utilizing the MNIST, SVHN, CIFAR-10, CIFAR-100, and miniImageNet datasets. For MNIST (Deng, 2012), Balanced SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky et al., 2009), we use (5/2) and (10/1) splits. For CIFAR-100 (Krizhevsky et al., 2009), we use (10/10) and (100/1) splits. For miniImageNet (Vinyals et al., 2016), we use (20/5) and (100/1) splits. (A sketch of constructing such class-incremental splits follows the table.)
Dataset Splits | Yes | To select hyperparameters for our benchmarking experiments presented in Tables 1 and 2, for every combination of method, dataset, and task split, we perform a separate grid search. For every considered set of hyperparameters, we perform one experiment with a single random seed. We then select the hyperparameter values that yield the highest final validation set accuracy, and we use those values for our benchmarking experiments with ten different random seeds. (A grid-search sketch follows the table.)
Hardware Specification | Yes | For running experiments, we used Slurm-based clusters with heterogeneous hardware, including nodes with A100 GPUs, V100 GPUs, and CPUs only.
Software Dependencies | No | The paper mentions software like PyTorch and Mammoth but does not provide specific version numbers for these software components.
Experiment Setup | Yes | All experiments follow the class-incremental learning scenario, and, unless otherwise noted, for each benchmark we only allow a single pass through the data, i.e. each training sample is seen only once except if it is stored in the replay buffer. Performance is evaluated as the final test accuracy after training on all classes. We adopt Adam (Kingma & Ba, 2014) as the optimizer. We provide additional details about the experimental setup in Appendix B. For example, Appendix B.3 lists the grids: learning rate (lr) {0.00003, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03}, batch size (bs) {1, 10, 32}.
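
To make Algorithms 1 and 2 concrete, the following is a minimal PyTorch sketch of PEC training and inference as described in the paper: each class gets a small student network trained to replicate the outputs of a frozen, randomly initialized teacher on that class's data only, and a test input is assigned to the class whose student has the smallest prediction error. The network sizes, the shared teacher across classes, the MSE error measure, and the single-pass loop are assumptions made for illustration, not the authors' exact implementation (which is available at the repository linked above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def small_mlp(in_dim, hidden, out_dim):
    # Small MLP used for both the frozen teacher and the per-class students.
    # The exact architecture is an assumption made for this sketch.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))


class PECSketch:
    """Sketch of PEC training (Algorithm 1) and inference (Algorithm 2)."""

    def __init__(self, num_classes, in_dim=784, hidden=64, out_dim=32, lr=1e-3):
        # Frozen, randomly initialized teacher; assumed shared across classes here.
        self.teacher = small_mlp(in_dim, hidden, out_dim)
        for p in self.teacher.parameters():
            p.requires_grad_(False)
        # One small student network per class.
        self.students = [small_mlp(in_dim, hidden, out_dim) for _ in range(num_classes)]
        self.lr = lr

    def train_class(self, c, loader):
        # Algorithm 1 (sketch): train student c to replicate the teacher on
        # data from class c only, in a single pass over that data.
        student = self.students[c]
        opt = torch.optim.Adam(student.parameters(), lr=self.lr)
        for x, _ in loader:  # loader is assumed to yield only class-c examples
            x = x.view(x.size(0), -1)
            loss = F.mse_loss(student(x), self.teacher(x))
            opt.zero_grad()
            loss.backward()
            opt.step()

    @torch.no_grad()
    def predict(self, x):
        # Algorithm 2 (sketch): classify by the smallest prediction error.
        x = x.view(x.size(0), -1)
        target = self.teacher(x)
        errors = torch.stack(
            [((s(x) - target) ** 2).mean(dim=1) for s in self.students], dim=1
        )
        return errors.argmin(dim=1)
```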
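
The (classes/tasks) splits listed under Open Datasets, e.g. a (5/2) split of CIFAR-10 into 5 tasks of 2 classes each, can be built as in the sketch below. It assumes torchvision datasets and omits the normalization and other transforms used in the actual experiments; the root path and batch size are illustrative.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms


def class_incremental_tasks(dataset, classes_per_task):
    # Group consecutive class labels into tasks, e.g. a (5/2) split of
    # CIFAR-10 yields 5 tasks of 2 classes each.
    targets = torch.as_tensor(dataset.targets)
    num_classes = int(targets.max()) + 1
    tasks = []
    for start in range(0, num_classes, classes_per_task):
        task_classes = torch.arange(start, start + classes_per_task)
        idx = torch.isin(targets, task_classes).nonzero(as_tuple=True)[0]
        tasks.append(Subset(dataset, idx.tolist()))
    return tasks


# Example: CIFAR-10 with a (5/2) split, one DataLoader per task.
cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                           transform=transforms.ToTensor())
tasks = class_incremental_tasks(cifar10, classes_per_task=2)
loaders = [DataLoader(t, batch_size=32, shuffle=True) for t in tasks]
```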
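
The hyperparameter selection protocol (a separate grid search per configuration with a single seed, after which the selected values are re-used for benchmarking with ten seeds) could look roughly like the sketch below. The `train_and_validate` helper is a hypothetical stand-in, not part of the paper's code; the grid values are those quoted from Appendix B.3.

```python
import itertools


def train_and_validate(lr, batch_size, seed=0):
    # Hypothetical stand-in: in the real protocol this would run one training
    # experiment with the given hyperparameters and return the final
    # validation-set accuracy. Here it only returns a dummy score.
    return 0.0


# Grids quoted from Appendix B.3.
learning_rates = [0.00003, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03]
batch_sizes = [1, 10, 32]

# One experiment per configuration, with a single random seed.
results = {
    (lr, bs): train_and_validate(lr, bs, seed=0)
    for lr, bs in itertools.product(learning_rates, batch_sizes)
}
best_lr, best_bs = max(results, key=results.get)
# The selected values would then be re-used for the benchmarking runs
# with ten different random seeds.
```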