A Statistical Theory of Regularization-Based Continual Learning

Authors: Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct simulation experiments to illustrate the performance of continual ridge regression (CRR), the minimum norm estimator (MN), and the generalized ℓ2-regularized estimator (GR). Simulation results. The simulation results for different noise levels are depicted in Figure 1.
Researcher Affiliation | Academia | (1) School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, China; (2) Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; (3) MIFA Lab, Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University, Shanghai, China; (4) Shanghai AI Laboratory, Shanghai, China.
Pseudocode | Yes | Algorithm 1: Generalized ℓ2-regularization method; Algorithm 2: Minimum norm estimator; Algorithm 3: Continual ridge regression; Algorithm 4: Early stopping estimator.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is publicly available or will be released.
Open Datasets | No | The paper describes generating synthetic data for its simulations (e.g., 'The true parameter w is sampled from N(0, I_p)'). It does not use or provide access information for a pre-existing publicly available dataset.
Dataset Splits | No | The paper describes a continual learning setup with sequentially arriving tasks and reports estimation errors for each task. It does not mention conventional train/validation/test dataset splits, as it uses synthetically generated data for each task rather than pre-split datasets.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU models, GPU models, or memory specifications.
Software Dependencies | No | The paper does not specify the versions of any software dependencies (e.g., programming languages, libraries, or frameworks) used for the experiments.
Experiment Setup | Yes | We set the task number T = 20 and sample size n_1 = ⋯ = n_T = 150. The parameter dimension p = 200, and hence each single task is overparameterized. We consider two noise levels: σ² = 1 or 5. We repeated our experiments 100 times and present the average results.
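
Since no source code is released, the Experiment Setup row can only be approximated. The following is a minimal sketch of such a simulation, not the authors' implementation. It takes T = 20, n = 150, p = 200, σ² and the 100 repetitions from the table above. Everything else is an assumption made for illustration: a single true parameter shared by all tasks, an isotropic Gaussian design for the features, a zero initial estimate, the penalty strength `lam`, and the particular form of the continual ridge update, which penalizes the squared distance to the previous task's estimate. The function names are hypothetical.

```python
import numpy as np


def simulate_tasks(T=20, n=150, p=200, sigma2=1.0, rng=None):
    """Generate T linear-regression tasks.

    ASSUMPTION: all tasks share one true parameter w_star.
    """
    rng = np.random.default_rng() if rng is None else rng
    w_star = rng.standard_normal(p)  # true parameter drawn from N(0, I_p)
    tasks = []
    for _ in range(T):
        # ASSUMPTION: isotropic Gaussian features; the table does not state the design.
        X = rng.standard_normal((n, p))
        y = X @ w_star + np.sqrt(sigma2) * rng.standard_normal(n)
        tasks.append((X, y))
    return w_star, tasks


def continual_ridge(tasks, lam, p):
    """Sequential ridge updates, one per task.

    ASSUMED objective at task t:
        (1/n) * ||y_t - X_t w||^2 + lam * ||w - w_{t-1}||^2
    whose minimizer is
        w_t = w_{t-1} + (X_t' X_t + n*lam*I)^{-1} X_t' (y_t - X_t w_{t-1}).
    """
    w = np.zeros(p)  # ASSUMPTION: start from the zero vector
    path = []
    for X, y in tasks:
        n = X.shape[0]
        A = X.T @ X + n * lam * np.eye(p)
        w = w + np.linalg.solve(A, X.T @ (y - X @ w))
        path.append(w.copy())
    return path


def average_errors(reps=100, lam=1.0, sigma2=1.0, seed=0, **task_kwargs):
    """Average squared estimation error ||w_t - w_star||^2 per task over reps runs."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(reps):
        w_star, tasks = simulate_tasks(sigma2=sigma2, rng=rng, **task_kwargs)
        path = continual_ridge(tasks, lam, p=w_star.size)
        errs.append([np.sum((w - w_star) ** 2) for w in path])
    return np.mean(errs, axis=0)  # one value per task


if __name__ == "__main__":
    for sigma2 in (1.0, 5.0):  # the two noise levels from the table
        errs = average_errors(reps=100, lam=1.0, sigma2=sigma2)
        print(f"sigma^2 = {sigma2}: final-task error = {errs[-1]:.3f}")
```

The value `lam=1.0` is a placeholder, not a setting reported in the table, so the numbers this sketch prints should not be expected to match Figure 1.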