OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning

Authors: Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENTAL RESULTS
Researcher Affiliation | Industry | Wei-Cheng Huang, Chun-Fu (Richard) Chen, Hsiang Hsu, Global Technology Applied Research, JPMorgan Chase, USA {wei-cheng.huang,richard.cf.chen,hsiang.hsu}@jpmchase.com
Pseudocode | Yes | Algorithm 1 OVOR: One Prompt with Virtual Outlier Regularization
Open Source Code | Yes | Our source code is available at https://github.com/jpmorganchase/ovor.
Open Datasets | Yes | Datasets. For evaluation, we use four widely-used datasets: ImageNet-R (Hendrycks et al., 2021a), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-A (Hendrycks et al., 2021b), and CUB-200 (Wah et al., 2011).
Dataset Splits | No | The paper describes how datasets are split into tasks with training and test sets (D^t_train, D^t_test) for class-incremental learning, but does not explicitly mention a separate validation split used for model selection or hyperparameter tuning.
Hardware Specification | No | The paper does not provide specific hardware details, such as the GPU or CPU models used to run the experiments.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer and the timm library, but does not specify their versions or list other software dependencies with version numbers.
Experiment Setup | Yes | Training Details. For experimental configurations of CIL training, we follow Smith et al. (2023a) as much as possible, using the Adam (Kingma & Ba, 2015) optimizer and a batch size of 128. The model is trained for 50 epochs per task on ImageNet-R and 20 on the other three datasets. ... For the regularization loss, we set λ to 0.1, the same as Liu et al. (2020). We use τ_Current = 24.0 and τ_Outlier = 3.0 for the two threshold hyperparameters in the loss term, and δ = 1.0 in the Huber loss L_δ, which is a component of the regularization loss. ... We use 1.0, 10, 160, and 100 for σ, α, β, and K, respectively. These are the configurations for ImageNet-R; for detailed hyperparameter settings on all datasets, please refer to Table C.17 and Table C.18 in the appendix.
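To make the reported setup easier to scan, the minimal Python/PyTorch sketch below collects the ImageNet-R hyperparameters quoted above and spells out the standard Huber loss L_δ that the regularization term builds on. The names ovor_config and huber_loss are illustrative placeholders, not the authors' released API, and the comments on σ, α, β, and K only echo the values quoted above.

```python
import torch

# ImageNet-R hyperparameters quoted above; `ovor_config` is an illustrative
# container, not part of the authors' released code.
ovor_config = dict(
    optimizer="Adam",           # Kingma & Ba (2015)
    batch_size=128,
    epochs_per_task=50,         # 20 per task for CIFAR-100, ImageNet-A, and CUB-200
    reg_lambda=0.1,             # λ, weight of the regularization loss
    tau_current=24.0,           # τ_Current, threshold hyperparameter in the loss term
    tau_outlier=3.0,            # τ_Outlier, threshold hyperparameter in the loss term
    huber_delta=1.0,            # δ of the Huber loss L_δ inside the regularization loss
    sigma=1.0, alpha=10, beta=160, K=100,  # remaining values as quoted; see Tables C.17-C.18
)

def huber_loss(residual: torch.Tensor, delta: float = 1.0) -> torch.Tensor:
    """Standard elementwise Huber loss L_δ: quadratic near zero, linear in the tails."""
    abs_r = residual.abs()
    return torch.where(
        abs_r <= delta,
        0.5 * residual ** 2,
        delta * (abs_r - 0.5 * delta),
    )
```

This only restates the quoted configuration; how the thresholds and virtual outliers enter the full regularization loss is defined in the paper and in the released repository.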