Understanding Forgetting in Continual Learning with Linear Regression

Authors: Meng Ding, Kaiyi Ji, Di Wang, Jinhui Xu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate our theoretical analysis, we conducted simulation experiments on both linear regression models and Deep Neural Networks (DNNs). Results from these simulations substantiate our theoretical findings.
Researcher Affiliation | Collaboration | 1) Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, USA; 2) Division of CEMSE, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
Pseudocode | No | The paper describes the SGD update rule in mathematical equations but does not present it in a pseudocode block or algorithm format.
Open Source Code | No | The paper does not provide any statement or link indicating that source code for their methodology is openly available.
Open Datasets | No | In our study, we designed three distinct tasks, denoted as Tasks 1, 2, and 3, each with a different feature space. To mimic real-world data imperfections, Gaussian noise with a standard deviation of 0.1 was added to the labels. This indicates synthetic data generation rather than use of a publicly available dataset.
Dataset Splits | No | The paper mentions training and testing but does not explicitly describe training/test/validation splits (e.g., percentages or sample counts) needed for reproduction, beyond stating 'various data sizes' and evaluation 'on each task'.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU or GPU models, or cloud computing instances) used for running the experiments.
Software Dependencies | No | The paper mentions using 'Stochastic Gradient Descent (SGD)' and 'Deep Neural Networks (DNNs)' but does not specify any particular software libraries, frameworks, or version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | For this experiment, a linear regression model was trained using Stochastic Gradient Descent (SGD) with a learning rate of 0.01 or 0.001. Each task sequence underwent training with various data sizes, ranging from 100 to 950 in increments of 50, and each task was trained for five epochs.
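
The Pseudocode row notes that the paper presents the SGD update only as equations. For orientation, the standard per-sample SGD update for a linear model under squared loss is sketched below; this is a reference form, not the paper's exact notation, and its step-size schedule or indexing may differ.

```latex
% Per-sample SGD update for linear regression with squared loss
% \ell_t(w) = \tfrac{1}{2}\,(w^\top x_t - y_t)^2
% (reference form only; the paper's own notation may differ).
\[
  w_{t+1} \;=\; w_t - \eta\,\nabla_w \ell_t(w_t)
          \;=\; w_t - \eta\,\bigl(w_t^\top x_t - y_t\bigr)\,x_t
\]
```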
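Combining the Open Datasets and Experiment Setup rows above, a minimal reproduction sketch might look like the following. The learning rate (0.01), label-noise standard deviation (0.1), five epochs per task, three tasks with different feature spaces, and sequential training follow the quoted setup; the feature dimension, test-set size, fixed training-set size, and the specific way the tasks' feature spaces differ (scaled Gaussian features with task-specific ground-truth weights) are assumptions not stated in the excerpts and are marked as such in the comments.

```python
# Minimal sketch (not the authors' code) of the simulation described in the table:
# three synthetic linear-regression tasks, Gaussian label noise (std 0.1),
# sequential SGD training (lr 0.01, 5 epochs per task), and per-task test error
# after each task to measure forgetting.
import numpy as np

rng = np.random.default_rng(0)
dim, n_train, n_test = 20, 500, 1000   # assumed sizes; the paper sweeps training size from 100 to 950
lr, epochs, noise_std = 0.01, 5, 0.1   # learning rate, epochs per task, label-noise std from the quoted setup

def make_task(scale):
    """One synthetic task: scaled Gaussian features, noisy linear labels (assumed construction)."""
    w_star = rng.normal(size=dim)                  # task-specific ground-truth weights (assumption)
    def sample(n):
        X = rng.normal(scale=scale, size=(n, dim))
        y = X @ w_star + rng.normal(scale=noise_std, size=n)
        return X, y
    return sample

# Three tasks with different feature spaces; the scales here are arbitrary assumptions.
tasks = [make_task(s) for s in (1.0, 0.5, 2.0)]
train_sets = [t(n_train) for t in tasks]
test_sets = [t(n_test) for t in tasks]

w = np.zeros(dim)                                  # single linear model shared across tasks
for k, (X, y) in enumerate(train_sets, start=1):
    for _ in range(epochs):                        # five passes over the current task's data
        for i in rng.permutation(len(y)):          # one-sample SGD steps
            w -= lr * X[i] * (X[i] @ w - y[i])
    # Evaluate on every task after finishing task k; rising error on earlier tasks indicates forgetting.
    errs = [float(np.mean((Xt @ w - yt) ** 2)) for Xt, yt in test_sets]
    print(f"after task {k}: per-task test MSE = {[round(e, 4) for e in errs]}")
```

Sweeping the training size from 100 to 950 in steps of 50 and repeating over random seeds would mirror the data-size sweep described in the Experiment Setup row.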