Nearly Optimal Bounds for Cyclic Forgetting

Authors: William Swartworth, Deanna Needell, Rachel Ward, Mark Kong, Halyun Jeong

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide theoretical bounds on the forgetting quantity in the continual learning setting for linear tasks, where each round of learning corresponds to projecting onto a linear subspace. For a cyclic task ordering on T tasks repeated m times each, we prove the best known upper bound of O(T^2/m) on the forgetting. Notably, our bound holds uniformly over all choices of tasks and is independent of the ambient dimension. Our main technical contribution is a characterization of the union of all numerical ranges of products of T (real or complex) projections as a sinusoidal spiral, which may be of independent interest. (An illustrative numerical sketch of this cyclic-projection setting follows the table.)
Researcher Affiliation | Academia | Halyun Jeong, University of California Los Angeles, hajeong@math.ucla.edu; Mark Kong, University of California Los Angeles, markkong@ucla.edu; Deanna Needell, University of California Los Angeles, deanna@math.ucla.edu; William Swartworth, Carnegie Mellon University, wswartwo@andrew.cmu.edu; Rachel Ward, University of Texas at Austin, rward@math.utexas.edu
Pseudocode | No | The paper describes algorithms (e.g., the Kaczmarz update) in prose but does not provide them in structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository.
Open Datasets | No | The paper works in a theoretical setting with abstract "tasks" and "data points"; it does not use or refer to any specific publicly available datasets for training.
Dataset Splits | No | This is a theoretical paper that does not conduct empirical experiments, so it does not mention training, validation, or test dataset splits.
Hardware Specification | No | This is a theoretical paper that does not describe any computational experiments, so it does not mention hardware specifications.
Software Dependencies | No | This is a theoretical paper that does not describe any computational experiments, so it does not mention specific software dependencies with version numbers.
Experiment Setup | No | This is a theoretical paper that does not describe any experimental setup with hyperparameters or training settings.
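
The setting summarized in the abstract (and the Kaczmarz-style update the paper describes in prose) can be illustrated numerically. The following is a minimal sketch, assuming each task is a consistent linear system whose solution set is an affine subspace, that learning a task orthogonally projects the current iterate onto that subspace, and that tasks are visited in a fixed cyclic order for m cycles. The dimensions, task sizes, and the per-task residual used as a forgetting proxy are illustrative choices, not the paper's exact definitions.

```python
# Minimal sketch of cyclic projections onto linear task subspaces.
# Assumptions (not taken verbatim from the paper): each task t is the affine
# subspace {x : A_t x = b_t} of a shared consistent system, learning task t
# means orthogonally projecting the iterate onto that subspace, and the
# "forgetting" proxy is the residual on each task after the final cycle.
import numpy as np

rng = np.random.default_rng(0)
d, T, m = 20, 5, 50          # ambient dimension, number of tasks, number of cycles

# Random consistent linear tasks sharing a ground truth x*: A_t x* = b_t.
x_star = rng.standard_normal(d)
tasks = []
for _ in range(T):
    A = rng.standard_normal((3, d))      # 3 equations per task (arbitrary choice)
    tasks.append((A, A @ x_star))

def project(x, A, b):
    """Orthogonally project x onto the affine subspace {y : A y = b}."""
    # x - A^+ (A x - b), using the Moore-Penrose pseudoinverse of A.
    return x - np.linalg.pinv(A) @ (A @ x - b)

x = np.zeros(d)
for _ in range(m):                        # m cycles over the T tasks in fixed order
    for A, b in tasks:
        x = project(x, A, b)

# Forgetting proxy: residual on each task after the final cycle.
residuals = [np.linalg.norm(A @ x - b) for A, b in tasks]
print("per-task residuals after", m, "cycles:", np.round(residuals, 6))
```

In this toy consistent setting the residuals shrink as m grows, loosely mirroring the O(T^2/m) decay of the paper's forgetting bound, although the exact forgetting quantity analyzed in the paper may differ from this proxy.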