Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization

Authors: Shoaib Ahmed Siddiqui, Adrian Weller, David Krueger, Gintare Karolina Dziugaite, Michael C. Mozer, Eleni Triantafillou

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate a range of increasingly-complex unlearning algorithms in this setting and discover a surprising finding: for numerous unlearning algorithms, the accuracy of the forget set jumps from around 50% post-unlearning to nearly 100% after fine-tuning the unlearned models on only the retain set, which is disjoint from the forget set. Fig. 1 shows this phenomenon on CIFAR-10 using ResNet-18, after having attempted to unlearn a subset of atypical instances of class airplane.
Researcher Affiliation | Collaboration | Shoaib Ahmed Siddiqui (University of Cambridge); Adrian Weller (University of Cambridge, The Alan Turing Institute); David Krueger (Mila); Gintare Karolina Dziugaite (Google DeepMind, Mila); Michael C. Mozer (Google DeepMind); Eleni Triantafillou (Google DeepMind). Correspondence to EMAIL.
Pseudocode | No | The paper describes methods and procedures in prose, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format outside of paragraph text.
Open Source Code | Yes | Code to reproduce our experiments: https://github.com/shoaibahmed/vision_relearning
Open Datasets | Yes | We use two different models for our evaluation from the ResNet model family [17], namely ResNet-18 and ResNet-34 [17]. In terms of datasets, we use CIFAR-10 and CIFAR-100 datasets [25] with 10 and 100 classes respectively, and a total of 50k training instances in each case (5k instances per class for CIFAR-10, and 500 instances per class for CIFAR-100). ... We evaluate on the higher-resolution ImageNette dataset [19], which is a subset of the ImageNet dataset [35].
Dataset Splits | Yes | We pretrain the model for 300 epochs using Adam optimizer [24] with a learning rate of 1e-4, cosine learning rate decay with a decay factor of 0.1, batch size of 128, and a weight decay of 1e-4 in all configurations. Unlearning. We consider two unlearning settings: sub-class unlearning, where the forget set consists of 10% of the class instances (here, sub-class means a subset of the complete class), and class-agnostic unlearning, where we select 1% of the data set regardless of class labels. This ensures that we use the same number of examples in the forget set for both settings on CIFAR-10 (we only evaluate sub-class unlearning on CIFAR-100).
Hardware Specification | Yes | We used NVIDIA RTX 3090 for each of our experiments, with the GPU equipped with 24GB of high-bandwidth memory (HBM) we only use a tiny fraction of it as we train small ResNet models on CIFAR-10/100.
Software Dependencies | No | The paper mentions using 'Adam optimizer [24]' and 'ResNet model family [17]', but it does not specify versions for any programming languages (e.g., Python), machine learning frameworks (e.g., PyTorch, TensorFlow), or other libraries used in the implementation.
Experiment Setup | Yes | Pretraining. We pretrain the model for 300 epochs using Adam optimizer [24] with a learning rate of 1e-4, cosine learning rate decay with a decay factor of 0.1, batch size of 128, and a weight decay of 1e-4 in all configurations. Unlearning. We use a smaller learning rate of 1e-5 without any weight decay and optimize the model for a 100 epochs during the unlearning phase. Relearning attack. During this phase, we fine-tune on a combination of the retain set D_R and a subset of the forget set for relearning (D_F^re). We explore the impact of different choices for relearning examples in Appendix G. We again use a small learning rate of 1e-5 without any weight decay, and optimize the model for just 10 epochs (except Fig. 1 where we optimized the model for 300 epochs). Similar to the pretraining stage, we use a cosine learning rate decay with a decay factor of 0.1.
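
The forget-set sizes quoted under Dataset Splits can be checked with a short sketch. It builds CIFAR-10-shaped labels (10 classes, 5k instances each) and draws a sub-class forget set (10% of one class) and a class-agnostic forget set (1% of all data); both come out at 500 examples. The function names and the uniform random sampling are assumptions made for illustration: the paper's Fig. 1 targets atypical instances, and the selection code in the authors' repository may differ.

```python
import random


def subclass_forget_indices(labels, target_class, fraction=0.10, seed=0):
    """Sub-class setting: sample `fraction` of the instances of one class.

    Uniform random sampling is an illustrative assumption, not the
    paper's selection rule.
    """
    rng = random.Random(seed)
    class_idx = [i for i, y in enumerate(labels) if y == target_class]
    k = round(fraction * len(class_idx))
    return sorted(rng.sample(class_idx, k))


def class_agnostic_forget_indices(labels, fraction=0.01, seed=0):
    """Class-agnostic setting: sample `fraction` of the whole dataset."""
    rng = random.Random(seed)
    k = round(fraction * len(labels))
    return sorted(rng.sample(range(len(labels)), k))


def retain_indices(num_examples, forget):
    """Retain set: every index that is not in the forget set."""
    forget = set(forget)
    return [i for i in range(num_examples) if i not in forget]


# CIFAR-10-shaped label list: 10 classes x 5,000 instances = 50,000.
labels = [c for c in range(10) for _ in range(5000)]

sub_forget = subclass_forget_indices(labels, target_class=0)
agnostic_forget = class_agnostic_forget_indices(labels)
retain = retain_indices(len(labels), sub_forget)

print(len(sub_forget), len(agnostic_forget), len(retain))  # 500 500 49500
```

The retain and forget sets are disjoint by construction, which is the property the relearning result in the Research Type row relies on.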
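
The three training phases in the Experiment Setup row can be collected into one configuration, together with a cosine schedule. This is a minimal sketch under a stated assumption: "decay factor of 0.1" is read here as annealing to a final learning rate of 0.1 times the initial one, which the excerpt does not confirm. The names `PHASES` and `cosine_lr` are hypothetical, and the excerpt does not say whether the unlearning phase uses a cosine schedule, so that entry is left as `None`.

```python
import math


def cosine_lr(epoch, total_epochs, base_lr, decay_factor=0.1):
    """Cosine anneal from base_lr to decay_factor * base_lr.

    Treating the decay factor as the final-to-initial LR ratio is an
    assumption about the paper's wording.
    """
    min_lr = decay_factor * base_lr
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hyperparameters as quoted in the Experiment Setup row (Adam in all phases).
PHASES = {
    "pretrain": {"epochs": 300, "base_lr": 1e-4, "weight_decay": 1e-4, "cosine": True},
    # Schedule for unlearning is not stated in the excerpt.
    "unlearn": {"epochs": 100, "base_lr": 1e-5, "weight_decay": 0.0, "cosine": None},
    # 10 epochs by default; Fig. 1 uses 300 relearning epochs instead.
    "relearn": {"epochs": 10, "base_lr": 1e-5, "weight_decay": 0.0, "cosine": True},
}

for name, cfg in PHASES.items():
    if cfg["cosine"]:
        start = cosine_lr(0, cfg["epochs"], cfg["base_lr"])
        end = cosine_lr(cfg["epochs"], cfg["epochs"], cfg["base_lr"])
        print(f"{name}: lr {start:.1e} -> {end:.1e} over {cfg['epochs']} epochs")
    else:
        print(f"{name}: lr {cfg['base_lr']:.1e}, schedule not specified")
```

Under this reading, pretraining ends at a learning rate of 1e-5, which matches the starting learning rate of the unlearning and relearning phases.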