Continual Learning by Using Information of Each Class Holistically
Authors: Wenpeng Hu, Qi Qin, Mengyu Wang, Jinwen Ma, Bing Liu (pp. 7797–7805)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation shows that PCL markedly outperforms the state-of-the-art baselines in settings with one or more classes per task. |
| Researcher Affiliation | Academia | (1) Department of Information Science, School of Mathematical Sciences, Peking University; (2) Center for Data Science, AAIS, Peking University; (3) Wangxuan Institute of Computer Technology, Peking University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "We now evaluate the proposed PCL technique (the code can be found here³) and compare it with both classic and the latest baselines"; footnote 3: "https://github.com/morning-dews/PCL" |
| Open Datasets | Yes | We use four benchmark image classification datasets and two text classification datasets in our experiments: MNIST (LeCun, Cortes, and Burges 1998), EMNIST-47 (Cohen et al. 2017), CIFAR10 and CIFAR100 (Krizhevsky and Hinton 2009) for images; 20news and DBPedia for text. |
| Dataset Splits | Yes | We randomly select 10% of the examples from the training set of each dataset as the validation set to tune the hyper-parameters. (A sketch of this split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using SGD as an optimizer and various baselines' code, but it does not provide specific version numbers for ancillary software components or libraries required for reproduction. |
| Experiment Setup | Yes | For training, we use SGD with momentum as the optimizer (learning rate = 0.1). We run each experiment five times. For each run of PCL or a baseline, we execute 500 epochs and use the maximum accuracy as the final result of the run. [...] PCL has 3 parameters that need tuning: λ and n in H-reg (Sec. 3.1) and η for transfer (Sec. 3.2). [...] After tuning, we get the best hyperparameters of λ = 0.5 and n = 12. For η, different datasets have different values: 0.001 for MNIST and EMNIST-47, 0.005 for CIFAR10 and DBPedia, 0.01 for CIFAR100 and 20news. (A configuration sketch appears after the table.) |
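
A minimal sketch of the 10% validation split described in the Dataset Splits row, assuming a PyTorch/torchvision pipeline; the framework, the fixed random seed, and the use of `random_split` are assumptions, as the paper does not specify them:

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load one benchmark training set (MNIST shown; the paper also uses
# EMNIST-47, CIFAR10/100, 20news, and DBPedia).
train_full = datasets.MNIST(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Hold out 10% of the training examples as a validation set for
# hyper-parameter tuning, as reported in the paper. The fixed seed
# is an assumption, added only to make the sketch itself repeatable.
n_val = int(0.1 * len(train_full))
train_set, val_set = random_split(
    train_full, [len(train_full) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),
)
```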
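And a hedged sketch of the training configuration from the Experiment Setup row, again assuming PyTorch. The learning rate, epoch count, and tuned hyper-parameters (λ = 0.5, n = 12, per-dataset η) come from the paper; the momentum coefficient, the placeholder model, and the loss function are assumptions:

```python
import torch
import torch.nn as nn

# Values reported in the paper.
LR = 0.1       # SGD learning rate
EPOCHS = 500   # epochs per run; the paper runs 5 times, keeping max accuracy
LAMBDA = 0.5   # lambda for H-reg (Sec. 3.1)
N_HREG = 12    # n for H-reg (Sec. 3.1)
ETA = {        # eta for transfer (Sec. 3.2), tuned per dataset
    "MNIST": 0.001, "EMNIST-47": 0.001,
    "CIFAR10": 0.005, "DBPedia": 0.005,
    "CIFAR100": 0.01, "20news": 0.01,
}

# Placeholder model: PCL's per-class networks are not reproduced here;
# this linear head only illustrates the optimizer wiring.
model = nn.Linear(28 * 28, 10)

# "SGD with momentum" -- the momentum coefficient is not given in the
# paper, so 0.9 is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=LR, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train(loader):
    """One full run: 500 epochs of plain supervised training."""
    for epoch in range(EPOCHS):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x.view(x.size(0), -1)), y)
            loss.backward()
            optimizer.step()
```

Note that the PCL-specific terms (the H-reg regularizer and the transfer mechanism weighted by η) are deliberately omitted; the sketch only pins down the optimizer and the hyper-parameter values the paper reports.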