Overcoming Catastrophic Forgetting by Incremental Moment Matching
Authors: Sang-Woo Lee, Jin-Hwa Kim, Jaehyun Jun, Jung-Woo Ha, Byoung-Tak Zhang
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that IMM achieves state-of-the-art performance by balancing the information between an old and a new network. |
| Researcher Affiliation | Collaboration | Sang-Woo Lee¹, Jin-Hwa Kim¹, Jaehyun Jun¹, Jung-Woo Ha², and Byoung-Tak Zhang¹,³ (¹Seoul National University; ²Clova AI Research, NAVER Corp; ³Surromind Robotics) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code for the experiments is available in a GitHub repository: https://github.com/btjhjeon/IMM_tensorflow |
| Open Datasets | Yes | We analyze our approach on a variety of datasets including the MNIST, CIFAR-10, Caltech-UCSD Birds, and Lifelog datasets. |
| Dataset Splits | No | The paper notes that a tuned hyperparameter setting is often used in previous works on continual learning, as it is difficult to define a validation set in that setting. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | Hyperparam denotes the main hyperparameter of each algorithm. For IMM with transfer, only α is tuned. The numbers in parentheses refer to standard deviations. Every IMM uses weight-transfer. Table 1 lists specific hyperparameter values, such as λ in Eq. (10), p in Eq. (11), and α₂ in Eqs. (4) and (7), for both untuned and tuned settings. |
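
The α₂ hyperparameters referenced in the Experiment Setup row (Eqs. (4) and (7) of the paper) belong to IMM's two merging rules, mean-IMM and mode-IMM. The minimal NumPy sketch below illustrates those rules for two task networks with flattened parameter vectors; it is an illustration only, and all function and variable names (`mean_imm`, `mode_imm`, `theta1`, `fisher1`, etc.) are hypothetical, not taken from the linked repository.

```python
import numpy as np

# Illustrative sketch of IMM's two merging rules for K = 2 tasks.
# theta1/theta2 are flattened parameter vectors of the task-specific
# networks; fisher1/fisher2 are diagonal Fisher information estimates.

def mean_imm(theta1, theta2, alpha2=0.5):
    """mean-IMM (Eq. 4): weighted average of the two networks' parameters."""
    alpha1 = 1.0 - alpha2
    return alpha1 * theta1 + alpha2 * theta2

def mode_imm(theta1, theta2, fisher1, fisher2, alpha2=0.5, eps=1e-8):
    """mode-IMM (Eq. 7): Fisher-weighted average, so parameters with
    larger diagonal Fisher values (higher certainty) dominate the merge."""
    alpha1 = 1.0 - alpha2
    w1, w2 = alpha1 * fisher1, alpha2 * fisher2
    return (w1 * theta1 + w2 * theta2) / (w1 + w2 + eps)

# Toy usage with random stand-ins for two trained networks.
rng = np.random.default_rng(0)
theta1, theta2 = rng.normal(size=10), rng.normal(size=10)
fisher1, fisher2 = rng.uniform(size=10), rng.uniform(size=10)
print(mean_imm(theta1, theta2, alpha2=0.5))
print(mode_imm(theta1, theta2, fisher1, fisher2, alpha2=0.5))
```

The design intuition, per the paper, is that mode-IMM approximates the mode of a mixture of Gaussian posteriors over the task networks, so parameters with higher Fisher information (lower posterior variance) contribute more to the merged network than under the plain average of mean-IMM.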