A Statistical Theory of Regularization-Based Continual Learning
Authors: Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct simulation experiments to illustrate the performance of continual ridge regression (CRR), the minimum norm estimator (MN), and the generalized ℓ2-regularized estimator (GR). The simulation results for different noise levels are depicted in Figure 1. |
| Researcher Affiliation | Academia | ¹School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, China; ²Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; ³MIFA Lab, Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University, Shanghai, China; ⁴Shanghai AI Laboratory, Shanghai, China. |
| Pseudocode | Yes | Algorithm 1: Generalized ℓ2-regularization method; Algorithm 2: Minimum norm estimator; Algorithm 3: Continual ridge regression; Algorithm 4: Early stopping estimator (see the sketch below this table) |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is publicly available or will be released. |
| Open Datasets | No | The paper describes generating synthetic data for its simulations (e.g., 'The true parameter w is sampled from N(0, Ip)'). It does not use or provide access information for a pre-existing publicly available dataset. |
| Dataset Splits | No | The paper describes a continual learning setup with sequentially arriving tasks and reports estimation errors for each task. It does not mention conventional train/validation/test dataset splits, as it uses synthetically generated data for each task rather than pre-split datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU models, GPU models, or memory specifications. |
| Software Dependencies | No | The paper does not specify the versions of any software dependencies (e.g., programming languages, libraries, or frameworks) used for the experiments. |
| Experiment Setup | Yes | We set the task number T = 20 and sample size n1 = ⋯ = nT = 150. The parameter dimension p = 200, and hence each single task is overparameterized. We consider two noise levels: σ2 = 1 or 5. We repeat our experiments 100 times and present the average results. (See the simulation sketch below.) |
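
The Pseudocode row lists four estimators. As a reading aid, here is a minimal Python sketch of two of them, continual ridge regression (CRR) and the minimum norm estimator (MN), under the standard regularization-based formulation ŵ_t = argmin_w ‖y_t − X_t w‖² + λ‖w − ŵ_{t−1}‖². The function names and the reading of MN as the λ → 0⁺ limit of CRR are assumptions for illustration; they are not taken verbatim from the paper's algorithm statements.

```python
import numpy as np

def crr_step(w_prev, X, y, lam):
    """One task update of continual ridge regression (CRR):
    w_t = argmin_w ||y - X w||^2 + lam * ||w - w_prev||^2,
    with closed form (X'X + lam*I)^{-1} (X'y + lam*w_prev)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * w_prev)

def mn_step(w_prev, X, y):
    """One task update of the minimum norm estimator (MN), read here as
    the lam -> 0+ limit of CRR: among all interpolating solutions of the
    new task, pick the one closest to w_prev in Euclidean norm."""
    return w_prev + np.linalg.pinv(X) @ (y - X @ w_prev)
```

Starting from ŵ_0 = 0 (an assumed initialization) and applying `crr_step` task by task reproduces the sequential protocol the Experiment Setup row describes.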
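The Experiment Setup and Open Datasets rows together pin down the simulation protocol: T = 20 tasks, n_t = 150 samples each, dimension p = 200, noise levels σ² ∈ {1, 5}, 100 repetitions, and true parameter w ~ N(0, I_p). The sketch below wires these together, reusing `crr_step` from the sketch above; the i.i.d. standard Gaussian design for X and the regularization level λ = 1 are illustrative assumptions not specified in the quoted excerpts.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, p = 20, 150, 200             # tasks, per-task samples, dimension (p > n)
n_reps, lam = 100, 1.0             # repetitions; lam = 1.0 is an illustrative choice

for sigma2 in (1.0, 5.0):          # the two noise levels quoted above
    errs = np.zeros((n_reps, T))
    for rep in range(n_reps):
        w_true = rng.standard_normal(p)            # w ~ N(0, I_p), as quoted
        w_hat = np.zeros(p)                        # assumed initialization w_0 = 0
        for t in range(T):
            X = rng.standard_normal((n, p))        # assumed Gaussian design
            y = X @ w_true + np.sqrt(sigma2) * rng.standard_normal(n)
            w_hat = crr_step(w_hat, X, y, lam)     # from the sketch above
            errs[rep, t] = np.sum((w_hat - w_true) ** 2)
    print(f"sigma2={sigma2}: mean estimation error per task =", errs.mean(axis=0).round(2))
```

Averaging the per-task estimation error over the 100 repetitions, as the final line does, matches the "average results" the Experiment Setup row quotes.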