Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Understanding Catastrophic Forgetting in Two-layer Convolutional Neural Networks

Authors: Boqi Li, Youjun Wang, Weiwei Liu

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental The findings are supported by empirical results from both simulated and real-world datasets: "We perform numerical analysis and conduct experiments on real-world datasets to further validate our theoretical findings." "In this section, we conduct experiments on both simulated and real-world datasets to validate our findings."
Researcher Affiliation Academia School of Computer Science, Wuhan University; National Engineering Research Center for Multimedia Software, Wuhan University; Institute of Artificial Intelligence, Wuhan University; Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University. Correspondence to: Weiwei Liu <EMAIL>.
Pseudocode No The paper describes the model architecture and training process using mathematical equations and textual descriptions, but no explicit pseudocode or algorithm blocks are present.
Open Source Code No The paper does not contain any explicit statements about releasing source code or provide links to code repositories.
Open Datasets Yes We conduct experiments on CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny-ImageNet (Deng et al., 2009) to evaluate the model's performance on real-world datasets.
Dataset Splits No For each dataset, we split the data into K/2 binary tasks, where K is the number of classes. We then sequentially select a pair of tasks, and train the model on the pair of tasks. This describes how tasks are formed, but not the train/test/validation splits for the datasets themselves in terms of percentages or counts.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies No "For the simulated dataset, we use PyTorch as the deep learning framework for training." This mentions PyTorch but does not specify a version number.
Experiment Setup Yes We utilize the logistic loss ℓ(F(x), y) = log(1 + e^{−yF(x)}) as the loss function and apply gradient descent (GD) to optimize the parameters. Throughout the learning process, the last layer remains fixed. Given a learning rate η, at round t of stage τ ∈ {1, 2}, the parameters of the network are updated as follows:... For the simulated dataset, we use PyTorch as the deep learning framework for training. The CNN model used is defined as Equation (1) with C = 10, and the model is optimized using GD. The learning rate is set to 0.1. In each stage, the model is trained for 100 epochs. The hyperparameters are set as follows: P = 1, σ_ξ = 0.5d, and d = 50. The value of α_u is fixed at 1, while α_v and α_ζ are varied across discrete values. On real-world datasets, we use ResNet-18 as the CNN model, which is trained for 100 epochs in both the first and second stages.
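The setup reported above (K classes paired into K/2 binary tasks, logistic loss ℓ(F(x), y) = log(1 + e^{−yF(x)}), plain gradient descent) can be sketched in a few lines of pure Python. This is an illustrative assumption of the pipeline, not the authors' code: the pairing order, the one-parameter linear scorer standing in for the paper's two-layer CNN, and all function names here are hypothetical.

```python
import math

def make_binary_tasks(num_classes):
    """Pair class labels into num_classes // 2 binary tasks
    (hypothetical pairing order: (0,1), (2,3), ...)."""
    labels = list(range(num_classes))
    return [(labels[2 * i], labels[2 * i + 1]) for i in range(num_classes // 2)]

def logistic_loss(f_x, y):
    """Logistic loss: log(1 + exp(-y * F(x))), y in {-1, +1}."""
    return math.log(1.0 + math.exp(-y * f_x))

def gd_step(w, data, lr=0.1):
    """One gradient-descent step for a toy linear scorer F(x) = w * x,
    a stand-in for the paper's two-layer CNN; lr=0.1 matches the
    learning rate reported in the setup."""
    grad = 0.0
    for x, y in data:
        # d/dw log(1 + exp(-y*w*x)) = -y*x / (1 + exp(y*w*x))
        grad += -y * x / (1.0 + math.exp(y * w * x))
    return w - lr * grad / len(data)

# Toy usage: CIFAR-10 has 10 classes, so 5 binary tasks.
tasks = make_binary_tasks(10)
w = 0.0
for _ in range(100):  # "trained for 100 epochs" per stage
    w = gd_step(w, [(1.0, 1), (-1.0, -1)])
```

Each stage of the paper's two-stage protocol would repeat this loop on a different task pair; forgetting is then measured by re-evaluating the first task's loss after stage two.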