The Ideal Continual Learner: An Agent That Never Forgets

Authors: Liangzu Peng, Paris Giampouras, René Vidal

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we focus on theoretically understanding continual learning and catastrophic forgetting by trying to answer Questions (Q1) and (Q2). In particular: We propose a general framework for continual learning, called the Ideal Continual Learner (ICL), and we show that, under mild assumptions, ICL never forgets. This characterization of never forgetting makes it possible to address Questions (Q1) and (Q2) by dissecting the optimization and generalization properties of ICL. We also derive generalization bounds for ICL, which allow us to theoretically quantify how rehearsal affects generalization.
Researcher Affiliation | Academia | Mathematical Institute for Data Science, Johns Hopkins University, Baltimore, USA; Innovation in Data Engineering and Science (IDEAS), University of Pennsylvania, Philadelphia, USA; NORCE Norwegian Research Centre, Norway.
Pseudocode | No | The paper describes algorithms and implementations (e.g., "ICL for continual linear regression can be implemented as follows.") but does not include a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository for the methodology described.
Open Datasets | No | The paper uses theoretical problem setups such as continual linear regression and continual matrix factorization, but does not specify or provide access to any named public datasets for its own analysis or examples.
Dataset Splits | No | The paper does not perform empirical experiments requiring dataset splits; hence, it does not provide specific training, validation, or test dataset split information.
Hardware Specification | No | The paper is theoretical and does not report on computational experiments; therefore, no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not report on computational experiments that would require listing specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings, as it does not report on empirical experiments.
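To make the discussion concrete, the ICL idea for continual linear regression can be sketched in code. The sketch below is a hypothetical illustration, not the authors' implementation: it assumes the realizable (noiseless) setting, where some shared parameter vector w* fits every task exactly, and the learner fits each new task while staying inside the solution set of all previous tasks. The function name and projector-based update are my own choices for exposition.

```python
import numpy as np

def icl_linear_regression(tasks, d):
    """Hypothetical sketch of the Ideal Continual Learner (ICL) for
    continual linear regression in the realizable (noiseless) setting.

    Each task supplies (X_t, y_t) with some shared w* satisfying
    X_t w* = y_t for all t.  The learner updates its parameters only
    within the common null space of all past task data, so predictions
    on earlier tasks never change -- i.e., it never forgets.
    """
    w = np.zeros(d)   # current parameter estimate
    P = np.eye(d)     # orthogonal projector onto the common null space
                      # of all past task data (the feasible directions)
    for X, y in tasks:
        M = X @ P
        # Min-norm update v within the feasible subspace, solving
        # X (w + P v) = y; realizability guarantees a solution exists.
        v = np.linalg.pinv(M) @ (y - X @ w)
        w = w + P @ v
        # Shrink the feasible subspace: remove the row space of M so
        # that future updates cannot change predictions on this task.
        P = P - np.linalg.pinv(M) @ M
    return w
```

Under the realizability assumption, the returned w fits every task seen exactly, even when the stacked data are underdetermined, which is the "never forgets" behavior the paper characterizes for ICL.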