OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning
Authors: Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTAL RESULTS |
| Researcher Affiliation | Industry | Wei-Cheng Huang, Chun-Fu (Richard) Chen, Hsiang Hsu Global Technology Applied Research, JPMorgan Chase, USA {wei-cheng.huang,richard.cf.chen,hsiang.hsu}@jpmchase.com |
| Pseudocode | Yes | Algorithm 1 OVOR: One Prompt with Virtual Outlier Regularization |
| Open Source Code | Yes | Our source code is available at https://github.com/jpmorganchase/ovor. |
| Open Datasets | Yes | Datasets. For evaluation, we use four widely-used datasets, ImageNet-R (Hendrycks et al., 2021a), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-A (Hendrycks et al., 2021b), and CUB-200 (Wah et al., 2011). |
| Dataset Splits | No | The paper describes how datasets are split into tasks with training and test sets (D^t_train, D^t_test) for class-incremental learning, but does not explicitly mention a separate validation split used for model selection or hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running experiments. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'timm library' but does not specify their version numbers or other software dependencies with specific versions. |
| Experiment Setup | Yes | Training Details. For experimental configurations of CIL training, we follow Smith et al. (2023a) as much as possible, using the Adam (Kingma & Ba, 2015) optimizer and a batch size of 128. The model is trained for 50 epochs per task on ImageNet-R and 20 for the other three datasets. ... For the regularization loss, we set λ to 0.1, the same as Liu et al. (2020). We use τ_Current = 24.0 and τ_Outlier = 3.0 for the two threshold hyperparameters in the loss term, and δ = 1.0 in the Huber loss L_δ, which is a component of the regularization loss. ... We use 1.0, 10, 160, 100 for σ, α, β, and K, respectively. These are the configurations for ImageNet-R; for detailed hyperparameter settings for all datasets, please refer to Table C.17 and Table C.18 in the appendix. |
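
The quoted hyperparameters are enough to sketch the training configuration and the general shape of a two-threshold regularization term built on the Huber loss, in the spirit of the energy-margin regularizer of Liu et al. (2020) that the paper cites. The sketch below is an illustration under assumptions, not the authors' released implementation: the tensor names (`score_current`, `score_outlier`), the direction of each threshold, and the exact combination of terms are guesses; only the numeric values come from the quote above.

```python
import torch
import torch.nn.functional as F

# Hyperparameters quoted from the paper's "Experiment Setup" row (ImageNet-R values).
CONFIG = {
    "optimizer": "Adam",
    "batch_size": 128,
    "epochs_per_task": {"ImageNet-R": 50, "CIFAR-100": 20, "ImageNet-A": 20, "CUB-200": 20},
    "lambda_reg": 0.1,    # weight of the regularization loss (as in Liu et al., 2020)
    "tau_current": 24.0,  # threshold for current-task (in-distribution) scores
    "tau_outlier": 3.0,   # threshold for virtual-outlier scores
    "delta": 1.0,         # delta of the Huber loss L_delta
    "sigma": 1.0, "alpha": 10, "beta": 160, "K": 100,
}


def huber(x: torch.Tensor, delta: float) -> torch.Tensor:
    """Huber loss L_delta of x against a zero target (quadratic near 0, linear beyond delta)."""
    return F.huber_loss(x, torch.zeros_like(x), delta=delta, reduction="mean")


def virtual_outlier_regularization(score_current: torch.Tensor,
                                   score_outlier: torch.Tensor,
                                   cfg: dict = CONFIG) -> torch.Tensor:
    """Hypothetical two-threshold regularizer: push current-task scores above
    tau_current and virtual-outlier scores below tau_outlier, each penalized
    through the Huber loss. OVOR's actual loss may differ in sign convention
    and exact form; this only mirrors the margin-style regularization it references."""
    current_term = huber(torch.relu(cfg["tau_current"] - score_current), cfg["delta"])
    outlier_term = huber(torch.relu(score_outlier - cfg["tau_outlier"]), cfg["delta"])
    return cfg["lambda_reg"] * (current_term + outlier_term)
```

In training, a term of this kind would simply be added to the per-task classification loss, weighted by λ = 0.1 as quoted above.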