Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

When to Forget? Complexity Trade-offs in Machine Unlearning

Authors: Martin Van Waerebeke, Marco Lorenzi, Giovanni Neglia, Kevin Scaman

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5. Experiments: We investigate the global landscape of the unlearning complexity ratio as a function of key factors, including the accepted excess risk threshold e and the unlearning budget (ϵ, δ), which are jointly quantified by the constant κϵ,δ. 5.1. Experimental Setting: The goal of the experiment section is to validate the theoretical analysis presented in Section 4 by comparing the performance of unlearning and retraining on both real and synthetic functions and datasets. 5.2. Experiments on Synthetic Data. 5.3. Experiments on Real Data. 5.4. Experimental Results: Figure 2 illustrates the empirical unlearning complexity ratio.
Researcher Affiliation	Academia	¹INRIA Paris, ²INRIA Sophia Antipolis. Correspondence to: Martin Van Waerebeke <EMAIL>.
Pseudocode	Yes	Algorithm 1: Iterative (Un)Learning Algorithm; Algorithm 2: Noise and Fine-Tune Unlearning Algorithm
Open Source Code	No	The paper does not contain any explicit statements about releasing code or links to source code repositories for the methodology described.
Open Datasets	Yes	5.1. Experimental Setting: We aim to learn linear regression models in R^d (with even d). We perform experiments both on synthetic worst-case functions, as analysed in our theory, and on the Digit dataset of handwritten digits, which is a subset of the larger dataset proposed in Alpaydin and Alimoglu (1996). Alpaydin, E. and Alimoglu, F. (1996). Pen-Based Recognition of Handwritten Digits. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5MG6K.
Dataset Splits	Yes	In every experiment, the retain and forget sets are obtained through the random splitting of the dataset into two parts of respective sizes n − r_f n and r_f n. We consider r_f = 10^{-2}.
Hardware Specification	No	The paper mentions experiments being performed but does not provide specific details about the hardware used (e.g., GPU models, CPU models, memory specifications).
Software Dependencies	No	The paper mentions using 'stochastic gradient descent' and 'the standard SGD optimizer' but does not provide specific version numbers for any software libraries, frameworks, or programming languages.
Experiment Setup	Yes	5.3. Experiments on Real Data: For the real data, the experimental process is simpler as we optimize a standard cross-entropy loss with L2 regularization. For various values of κϵ,δ and e, we measure T_e^U and T_e^S in a more realistic machine learning setting, with decaying learning rate, batch size of 64, and averaging the results over 50 runs. The learning rate is initialised at 10^{-2} and multiplied by 0.6 every 1000 epochs. Each experiment is repeated 50 times and the results are then averaged. We used a batch size of 64 and trained until the threshold e was reached, for every chosen value of κϵ,δ. The optimizer used is the standard SGD optimizer without acceleration.
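
The paper releases no code, so the following is only a minimal sketch of two details quoted in the table: the random retain/forget split and the step-decayed learning rate. The function names, the seed, and the rounding of the forget-set size are hypothetical choices made for illustration. The forget ratio and the initial learning rate are both read as 10^{-2}, which is an assumption because the exponents are garbled in the extracted text.

```python
import random


def split_retain_forget(n, forget_ratio=1e-2, seed=0):
    """Randomly partition indices 0..n-1 into (retain, forget) lists.

    forget_ratio=1e-2 is an assumed reading of the garbled "rf = 10 2".
    Rounding the forget-set size to the nearest integer is also an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_forget = round(forget_ratio * n)
    return indices[n_forget:], indices[:n_forget]


def step_decay_lr(epoch, initial_lr=1e-2, factor=0.6, step=1000):
    """Learning rate at a given epoch, multiplied by `factor` every `step` epochs.

    initial_lr=1e-2 is an assumed reading of the garbled "10 2".
    """
    return initial_lr * factor ** (epoch // step)


if __name__ == "__main__":
    retain, forget = split_retain_forget(1000)
    print(len(retain), len(forget))
    for epoch in (0, 999, 1000, 2500):
        print(epoch, step_decay_lr(epoch))
```

The schedule is written as a pure function of the epoch so that it can be checked independently of any training loop; the authors' actual implementation may differ.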