Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Predicting the Susceptibility of Examples to Catastrophic Forgetting
Authors: Guy Hacohen, Tinne Tuytelaars
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our key observation is a last-in-first-out forgetting pattern: examples learned later are more prone to forgetting, while earlier-learned ones are preserved. This aligns with the simplicity bias of neural networks (Shah et al., 2020; Szegedy et al., 2014), where simpler examples are learned first. As a result, simple examples are consistently remembered, while more complex ones are forgotten as new data arrives. This pattern holds across a wide range of architectures, datasets, and training configurations, including variations in learning rates, optimizers, schedulers, epochs, and regularization strategies (see 2, App. D). Fig. 1 visualizes remembered vs. forgotten examples in CIFAR-100. |
| Researcher Affiliation | Academia | 1ESAT-PSI, KU Leuven, Belgium. Correspondence to: Guy Hacohen <EMAIL>, Tinne Tuytelaars <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Training a CL method with SBS. Input: task data Dt, buffer size |B|, epochs E, number of quickly/slowly learned examples to remove q, s. Output: buffer of size |B| |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. Phrases like 'We release our code...' or a direct repository link are missing. |
| Open Datasets | Yes | Datasets. We investigated various image continual learning classification tasks using split versions of several image datasets, including CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015). |
| Dataset Splits | Yes | The data is split into T tasks by partitioning the classes into T equal-sized subsets. This partitioning is denoted dataset-T. For example, splitting CIFAR-10 into 5 tasks is denoted CIFAR-10-5, comprising 5 tasks, each with 2 distinct classes. |
| Hardware Specification | Yes | All networks were trained on NVIDIA TITAN X. |
| Software Dependencies | No | The paper mentions software components and frameworks like "ResNet-18", "SGD optimizer", "cosine scheduler", "experience replay strategy", and "framework of (Boschini et al., 2022; Buzzega et al., 2020)". However, it does not provide specific version numbers for these or other ancillary software dependencies (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | In our experiments, unless stated otherwise, we trained ResNet-18 for E = 100 epochs per task. We employed a base learning rate of 0.1 with a cosine scheduler, SGD optimizer, momentum of 0.9, and weight decay of 0.0005. |
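The last-in-first-out pattern quoted under Research Type suggests a simple diagnostic: record the epoch at which each example was first learned, then compare that statistic for examples later forgotten versus remembered. The sketch below is illustrative only; the function name `lifo_gap`, the toy inputs, and the mean-gap metric are our choices, not the authors' measurement.

```python
def lifo_gap(learn_epoch, forgotten):
    """Compare the mean first-learned epoch of forgotten vs. remembered
    examples. A positive gap means later-learned examples are the ones
    being forgotten, consistent with the last-in-first-out pattern.
    Toy diagnostic, not the paper's exact metric."""
    f = [e for e, lost in zip(learn_epoch, forgotten) if lost]
    r = [e for e, lost in zip(learn_epoch, forgotten) if not lost]
    return sum(f) / len(f) - sum(r) / len(r)

# Toy example: 6 examples learned at epochs 1..6; the two
# latest-learned are forgotten after the next task arrives.
gap = lifo_gap([1, 2, 3, 4, 5, 6], [False] * 4 + [True] * 2)
print(gap)  # 5.5 - 2.5 = 3.0
```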
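Algorithm 1 is only quoted in outline in the Pseudocode row. One plausible reading, sketched below under loud assumptions: SBS ranks each example by how quickly it was learned, drops the q quickest-learned and s slowest-learned, and fills the replay buffer from the remainder. The function name, the `learn_epoch` speed proxy, and the uniform-sampling step are our guesses, not the paper's exact procedure.

```python
import random

def sbs_buffer(example_ids, learn_epoch, buffer_size, q, s, seed=0):
    """Hedged sketch of the buffer step in Algorithm 1 (assumed SBS).

    Rank examples by a learning-speed proxy (learn_epoch: epoch at which
    each example was first classified correctly), drop the q quickest and
    s slowest, then sample the buffer uniformly from the middle band.
    """
    ranked = sorted(example_ids, key=lambda i: learn_epoch[i])
    middle = ranked[q:len(ranked) - s]  # keep neither extreme
    rng = random.Random(seed)
    return rng.sample(middle, min(buffer_size, len(middle)))

# Toy data: 20 examples learned at increasing epochs; remove the
# 3 quickest and 3 slowest, then sample a buffer of 5.
buf = sbs_buffer(list(range(20)), {i: i for i in range(20)},
                 buffer_size=5, q=3, s=3)
print(sorted(buf))
```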
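The class-partition scheme in the Dataset Splits row (dataset-T, e.g. CIFAR-10-5) can be sketched as follows; the function name and the toy labels are ours, but the split logic follows the quoted description.

```python
def split_into_tasks(labels, num_tasks):
    """Partition the set of classes into num_tasks equal-sized subsets and
    return, per task, the indices of examples in that subset."""
    classes = sorted(set(labels))
    if len(classes) % num_tasks != 0:
        raise ValueError("number of classes must divide evenly into tasks")
    per_task = len(classes) // num_tasks
    tasks = []
    for t in range(num_tasks):
        task_classes = set(classes[t * per_task:(t + 1) * per_task])
        tasks.append([i for i, y in enumerate(labels) if y in task_classes])
    return tasks

# Toy stand-in for CIFAR-10: 100 labels over 10 classes -> CIFAR-10-5,
# i.e. 5 tasks of 2 classes each.
labels = [c for c in range(10) for _ in range(10)]
tasks = split_into_tasks(labels, num_tasks=5)
print(len(tasks), len(tasks[0]))  # 5 tasks, 20 examples each
```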
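The Experiment Setup row names a base learning rate of 0.1 with a cosine scheduler over 100 epochs per task. A minimal sketch of that schedule, assuming a single-cycle cosine anneal to zero (the standard form, matching PyTorch's `CosineAnnealingLR` with `eta_min=0`; the paper does not state the floor):

```python
import math

def cosine_lr(epoch, total_epochs=100, base_lr=0.1, min_lr=0.0):
    """Single-cycle cosine-annealed learning rate: starts at base_lr
    and decays to min_lr by the final epoch."""
    return min_lr + 0.5 * (base_lr - min_lr) * (
        1 + math.cos(math.pi * epoch / total_epochs))

print(round(cosine_lr(0), 4))    # 0.1 at the start of a task
print(round(cosine_lr(100), 4))  # 0.0 by the final epoch
```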