Revisiting Neural Networks for Continual Learning: An Architectural Perspective
Authors: Aojun Lu, Tao Feng, Hangjie Yuan, Xiaotian Song, Yanan Sun
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental validation across various CL settings and scenarios demonstrates that the improved architectures are parameter-efficient, achieving state-of-the-art performance in CL while being 86%, 61%, and 97% more compact in terms of parameters than the naive CL architecture in Task IL and Class IL. |
| Researcher Affiliation | Academia | Aojun Lu (Sichuan University), Tao Feng (Tsinghua University), Hangjie Yuan (Zhejiang University), Xiaotian Song (Sichuan University), and Yanan Sun (Sichuan University) |
| Pseudocode | No | The paper describes methods and strategies in text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/byyx666/ArchCraft. |
| Open Datasets | Yes | Benchmark. For the CL scenarios mentioned above, i.e., Task IL and Class IL, we assess network performance on CIFAR-100. Benchmark. We choose CIFAR-100 and ImageNet-100 to evaluate the ArchCraft-guided architectures. |
| Dataset Splits | No | The paper mentions training and evaluation on a 'test set' but does not explicitly provide details for a separate validation split, such as percentages, counts, or a specific strategy for creating one. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions following 'PyCIL [Zhou et al., 2023]' for training in Class IL, but it does not provide specific version numbers for PyCIL or any other software dependencies. |
| Experiment Setup | Yes | Implementation Details. For Task IL, we train the model for 60 epochs in the first task and 20 epochs in the subsequent tasks. For Class IL, we follow PyCIL [Zhou et al., 2023] to train the model for 200 epochs in the first task and 70 epochs in the subsequent tasks. In Task IL, the network is trained using a vanilla SGD optimizer, while in Class IL, a replay buffer containing 2,000 examples is employed. |
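The Open Datasets row quotes CIFAR-100 and ImageNet-100 as the benchmarks for Task IL and Class IL. For reference, the snippet below is a minimal, hypothetical sketch of how CIFAR-100 could be partitioned into disjoint class-incremental tasks; the task count, class ordering, and seed are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch: splitting CIFAR-100 into class-incremental tasks.
# The number of tasks, class ordering, and seed are assumptions for illustration.
import numpy as np
from torchvision import datasets

def make_class_il_tasks(num_tasks=10, seed=1993):
    """Partition CIFAR-100's 100 classes into `num_tasks` disjoint class groups."""
    rng = np.random.default_rng(seed)
    classes = rng.permutation(100)          # random class order (assumed)
    groups = np.array_split(classes, num_tasks)

    train = datasets.CIFAR100(root="./data", train=True, download=True)
    targets = np.array(train.targets)
    # Indices of the training samples belonging to each task's class group
    return [np.flatnonzero(np.isin(targets, g)) for g in groups]

task_indices = make_class_il_tasks()
print([len(ix) for ix in task_indices])     # 10 tasks x 5,000 images each
```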
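The Experiment Setup row reports the schedule directly: 60/20 epochs per task in Task IL, 200/70 in Class IL (following PyCIL), vanilla SGD in Task IL, and a 2,000-exemplar replay buffer in Class IL. The sketch below wires those reported numbers into a generic PyTorch training loop; the model, learning rate, and buffer-filling policy are assumptions for illustration only.

```python
# Minimal sketch of the reported schedule. Only the epoch counts, the SGD
# optimizer, and the 2,000-exemplar buffer size come from the paper; the
# learning rate and buffer policy are assumptions.
import random
import torch
from torch import nn

EPOCHS = {"task_il": (60, 20), "class_il": (200, 70)}  # (first task, later tasks)
BUFFER_SIZE = 2000

def train_sequence(model, task_loaders, scenario="class_il", lr=0.1):
    first, rest = EPOCHS[scenario]
    buffer = []                                         # stored (x, y) exemplars
    for t, loader in enumerate(task_loaders):
        opt = torch.optim.SGD(model.parameters(), lr=lr)  # vanilla SGD, as reported
        for _ in range(first if t == 0 else rest):
            for x, y in loader:
                if scenario == "class_il" and buffer:     # replay old exemplars
                    bx, by = zip(*random.sample(buffer, min(len(buffer), len(y))))
                    x = torch.cat([x, torch.stack(bx)])
                    y = torch.cat([y, torch.stack(by)])
                loss = nn.functional.cross_entropy(model(x), y)
                opt.zero_grad()
                loss.backward()
                opt.step()
        if scenario == "class_il":                        # naive fill policy (assumed)
            for x, y in loader:
                for xi, yi in zip(x, y):
                    if len(buffer) < BUFFER_SIZE:
                        buffer.append((xi, yi))
    return model
```

Everything beyond the quoted numbers would need to be checked against the released ArchCraft code before being treated as the authors' actual configuration.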