DualNet: Continual Learning, Fast and Slow
Authors: Quang Pham, Chenghao Liu, Steven Hoi
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on two challenging continual learning benchmarks of CORE50 and miniImageNet show that DualNet outperforms state-of-the-art continual learning methods by a large margin. We further conduct ablation studies of different SSL objectives to validate DualNet's efficacy, robustness, and scalability. |
| Researcher Affiliation | Collaboration | Quang Pham¹, Chenghao Liu², Steven C.H. Hoi¹,² (¹Singapore Management University, hqpham.2017@smu.edu.sg; ²Salesforce Research Asia, {chenghao.liu, shoi}@salesforce.com) |
| Pseudocode | Yes | Due to space constraints, we refer to the supplementary materials for DualNet's pseudo-code, additional results, experiments' settings such as dataset summary, evaluation metrics, hyper-parameter configurations, and further discussions. |
| Open Source Code | Yes | Code is publicly available at https://github.com/phquang/DualNet. |
| Open Datasets | Yes | Benchmarks: We consider the Split continual learning benchmarks constructed from the miniImageNet [54] and CORE50 dataset [35] with three validation tasks and 17, 10 continual learning tasks, respectively. |
| Dataset Splits | Yes | We consider the Split continual learning benchmarks constructed from the miniImageNet [54] and CORE50 dataset [35] with three validation tasks and 17, 10 continual learning tasks, respectively. ... For all methods, the hyper-parameters are selected by performing grid-search on the cross-validation tasks. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments within the main text. It refers to the Appendix for compute details, which is not provided in this context. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Look-ahead), specific SSL methods (Barlow Twins, SimCLR, SimSiam, BYOL), and a backbone (ResNet18) by name, but it does not specify version numbers for these software components or any underlying programming languages or libraries (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | In the supervised learning phase, all methods are optimized by the SGD optimizer over one epoch with mini-batch size 10 and 32 on the Split miniImageNet and CORE50 benchmarks respectively [43]. In the representation learning phase, we use the Look-ahead optimizer [61] to train DualNet's slow learner as described in Section 2.3. We employ an episodic memory with 50 samples per task and the Ring-buffer management strategy [36] in the task-aware setting. In the task-free setting, the memory is implemented as a reservoir buffer [55] with 100 samples per class. We simulate the synchronous training property in DualNet by training the slow learner with n iterations using the episodic memory data before observing a mini-batch of labeled data. DualNet's slow learner optimizes the Barlow Twins objective for n = 3 iterations between every incoming mini-batch of labeled data. (Hedged sketches of this training scheme appear below the table.) |
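
The Experiment Setup row states that DualNet's slow learner optimizes the Barlow Twins objective on episodic-memory data. For readers unfamiliar with that objective, here is a minimal PyTorch-style sketch of a Barlow Twins loss; the function name, the `lambd` weight, and the normalization epsilon are illustrative defaults, not values reported in the paper.

```python
import torch

def barlow_twins_loss(z1: torch.Tensor, z2: torch.Tensor, lambd: float = 5e-3) -> torch.Tensor:
    """Barlow Twins objective: push the cross-correlation matrix of two
    augmented views' embeddings towards the identity matrix."""
    # Standardize each embedding dimension over the batch.
    z1 = (z1 - z1.mean(dim=0)) / (z1.std(dim=0) + 1e-6)
    z2 = (z2 - z2.mean(dim=0)) / (z2.std(dim=0) + 1e-6)

    n, d = z1.shape
    c = (z1.T @ z2) / n  # d x d cross-correlation matrix

    on_diag = torch.diagonal(c).add(-1).pow(2).sum()              # drive diagonal -> 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()   # drive off-diagonal -> 0
    return on_diag + lambd * off_diag
```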
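
The same row describes the synchronous training scheme: before each incoming labeled mini-batch, the slow learner takes n = 3 self-supervised steps on samples drawn from the episodic memory, after which the fast learner takes a supervised step. The sketch below (reusing `barlow_twins_loss` from above) illustrates one such interleaved step with a plain reservoir-sampling buffer. The class and function names, the `augment` hook, the memory batch size, and the use of generic optimizers in place of the paper's Look-ahead optimizer and fast-weight adaptation mechanism are all simplifying assumptions, not the authors' implementation.

```python
import random
import torch
import torch.nn.functional as F

class ReservoirBuffer:
    """Fixed-capacity episodic memory using standard reservoir sampling
    (a simplification of the per-task ring buffer / per-class reservoir
    described in the table above)."""
    def __init__(self, capacity: int):
        self.capacity, self.seen, self.data = capacity, 0, []

    def add(self, xs: torch.Tensor, ys: torch.Tensor) -> None:
        for x, y in zip(xs, ys):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((x, y))
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (x, y)

    def sample(self, batch_size: int):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def dualnet_style_step(slow_net, fast_head, augment, memory, x, y,
                       slow_opt, fast_opt, n_slow_iters: int = 3):
    """One interleaved step (sketch): n_slow_iters self-supervised slow
    updates on memory data, then a supervised fast update on the batch."""
    loss_ssl = torch.zeros(())
    if memory.data:  # skip SSL replay until the memory holds samples
        for _ in range(n_slow_iters):
            xm, _ = memory.sample(batch_size=32)          # labels unused here
            loss_ssl = barlow_twins_loss(slow_net(augment(xm)),
                                         slow_net(augment(xm)))
            slow_opt.zero_grad()
            loss_ssl.backward()
            slow_opt.step()

    # Fast learner: one supervised step on the incoming mini-batch.
    loss_sup = F.cross_entropy(fast_head(slow_net(x)), y)
    fast_opt.zero_grad()
    loss_sup.backward()
    fast_opt.step()

    memory.add(x, y)
    return loss_ssl.item(), loss_sup.item()
```

In DualNet itself, `slow_opt` would be a Look-ahead optimizer [61] and the fast learner adapts the slow features through learned transformation coefficients rather than a plain classification head; see the paper and the linked repository for those details.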