Compacting, Picking and Growing for Unforgetting Continual Learning
Authors: Ching-Yi Hung, Cheng-Hao Tu, Cheng-En Wu, Chien-Hung Chen, Yi-Ming Chan, Chu-Song Chen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our approach can incrementally learn a deep model to tackle multiple tasks without forgetting, while maintaining model compactness and achieving performance more satisfactory than individual task training. |
| Researcher Affiliation | Academia | Steven C. Y. Hung, Cheng-Hao Tu, Cheng-En Wu, Chien-Hung Chen, Yi-Ming Chan, and Chu-Song Chen Institute of Information Science, Academia Sinica, Taipei, Taiwan MOST Joint Research Center for AI Technology and All Vista Healthcare {brent12052003, andytu455176}@gmail.com, {chengen, redsword26, yiming, song}@iis.sinica.edu.tw |
| Pseudocode | Yes | Algorithm 1: Compacting, Picking and Growing Continual Learning. Input: task 1 and an original model trained on task 1. Set an accuracy goal for task 1; alternately remove small weights and re-train the remaining weights for task 1 via gradual pruning [51], as long as the accuracy goal still holds; let the model weights preserved for task 1 be W^P_1 (referred to as task-1 weights) and those removed by the iterative pruning be W^E_1 (referred to as the released weights). For task k = 2 … K (let the released weights of task k be W^E_k): set an accuracy goal for task k; apply a mask M to the weights W^P_{1:k-1}; train both M and W^E_{k-1} for task k, with W^P_{1:k-1} fixed; if the accuracy goal is not achieved, expand the number of filters (weights) in the model, reset W^E_{k-1}, and go to the previous step; gradually prune W^E_{k-1} to obtain W^E_k (with W^P_{1:k-1} fixed) for task k, until the accuracy goal is met; set W^P_k = W^E_{k-1} \ W^E_k and W^P_{1:k} = W^P_{1:k-1} ∪ W^P_k; end for. (A minimal code sketch of this loop appears after the table.) |
| Open Source Code | Yes | Our codes are available at https://github.com/ivclab/CPG. |
| Open Datasets | Yes | We divide the CIFAR-100 dataset into 20 tasks. Each task has 5 classes, 2500 training images, and 500 testing images. In the experiment, the VGG16-BN model (VGG16 with batch normalization layers) is employed to train the 20 tasks sequentially. (A sketch of this 20-task split appears after the table.) |
| Dataset Splits | No | Section 4.1 mentions '2500 training images, and 500 testing images' per task for CIFAR-100, and Table 4 shows '#Train' and '#Eval' counts for the other datasets. However, the paper does not specify whether a validation set is held out separately from the training/testing splits, nor any split percentages or validation methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | We implement our CPG approach and independent task learning (from scratch or fine-tuning) via PyTorch [30] in all experiments, but implement DEN [27] via TensorFlow [1] with its official codes. However, specific version numbers for PyTorch or TensorFlow are not provided. |
| Experiment Setup | No | The paper specifies the models used (VGG16-BN, ResNet50, and a 20-layer CNN from SphereFace) and mentions procedural settings such as gradual pruning toward an accuracy goal, but it does not provide training hyperparameters such as learning rates, batch sizes, optimizers, or numbers of epochs. |
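
To make the bookkeeping behind Algorithm 1 concrete, below is a minimal sketch over a single weight tensor. It is not the authors' released implementation (https://github.com/ivclab/CPG): it substitutes one-shot magnitude pruning for the gradual schedule of [51], binarizes the learned picking mask with a fixed threshold, and the class and method names (`CPGWeights`, `compact`, `pick`, `grow`) are illustrative assumptions.

```python
import torch

class CPGWeights:
    """Per-layer bookkeeping for compacting, picking, and growing (illustrative sketch)."""

    def __init__(self, shape):
        self.weight = torch.randn(shape) * 0.01                # shared weight tensor W
        self.preserved = torch.zeros(shape, dtype=torch.bool)  # W^P_{1:k}: frozen old-task weights
        self.free = torch.ones(shape, dtype=torch.bool)        # W^E_k: released / trainable weights
        self.picking_masks = {}                                # task id -> binary mask over old weights

    def compact(self, prune_ratio=0.5):
        """Prune the smallest-magnitude free weights after training a task;
        the survivors become that task's preserved weights W^P_k."""
        vals = self.weight[self.free].abs()
        k = max(int(prune_ratio * vals.numel()), 1)
        thresh = vals.kthvalue(k).values
        kept = self.free & (self.weight.abs() > thresh)
        self.weight[self.free & ~kept] = 0.0   # pruned weights are released for future tasks
        self.preserved |= kept                 # W^P_{1:k} = W^P_{1:k-1} ∪ W^P_k
        self.free &= ~kept
        return kept

    def pick(self, task_id, real_mask, threshold=5e-3):
        """Binarize a learned real-valued mask to select old-task weights to reuse."""
        self.picking_masks[task_id] = self.preserved & (real_mask > threshold)

    def effective_weight(self, task_id, task_kept):
        """Weights visible to a task: its own kept weights plus the old weights
        chosen by its picking mask (the old weights themselves stay fixed)."""
        picked = self.picking_masks.get(task_id, torch.zeros_like(task_kept))
        return self.weight * (task_kept | picked)

    def grow(self, extra_rows):
        """Growing step: append newly allocated free weights (here, extra output rows)."""
        pad_shape = (extra_rows,) + tuple(self.weight.shape[1:])
        self.weight = torch.cat([self.weight, torch.randn(pad_shape) * 0.01])
        self.preserved = torch.cat([self.preserved, torch.zeros(pad_shape, dtype=torch.bool)])
        self.free = torch.cat([self.free, torch.ones(pad_shape, dtype=torch.bool)])
        pad = torch.zeros(pad_shape, dtype=torch.bool)
        self.picking_masks = {t: torch.cat([m, pad]) for t, m in self.picking_masks.items()}
```

In the full method, `compact` would be applied gradually (prune, then re-train the surviving free weights) while checking the per-task accuracy goal, and growing is only invoked when the goal cannot be met with the current capacity.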
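
Similarly, the 20-task CIFAR-100 protocol from Section 4.1 (5 classes, 2,500 training and 500 test images per task) can be approximated as below. The assignment of classes to tasks in consecutive label blocks, and the bare `ToTensor` transform, are assumptions; the paper does not state the exact grouping or preprocessing.

```python
from torch.utils.data import Subset
from torchvision import datasets, transforms

def cifar100_tasks(root="./data", num_tasks=20, train=True):
    """Split CIFAR-100 into `num_tasks` disjoint tasks with equal class counts."""
    classes_per_task = 100 // num_tasks  # 5 classes per task for 20 tasks
    dataset = datasets.CIFAR100(root, train=train, download=True,
                                transform=transforms.ToTensor())
    tasks = []
    for t in range(num_tasks):
        label_set = set(range(t * classes_per_task, (t + 1) * classes_per_task))
        idx = [i for i, y in enumerate(dataset.targets) if y in label_set]
        tasks.append(Subset(dataset, idx))  # 2,500 train / 500 test images per task
    return tasks
```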