Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Towards Unbounded Machine Unlearning
Authors: Meghdad Kurmanji, Peter Triantafillou, Jamie Hayes, Eleni Triantafillou
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The above are substantiated through a comprehensive empirical evaluation against previous state-of-the-art. |
| Researcher Affiliation | Collaboration | Meghdad Kurmanji (University of Warwick), Peter Triantafillou (University of Warwick), Jamie Hayes (Google DeepMind), Eleni Triantafillou (Google DeepMind) |
| Pseudocode | Yes | We provide pseudocode, training plots and ablations in the Appendix. (Section 3.1) [...] Algorithm 1 SCRUB (Section 9). |
| Open Source Code | Yes | Our code is available for reproducibility: https://github.com/Meghdad92/SCRUB |
| Open Datasets | Yes | We utilize the same two datasets from previous work: CIFAR-10 [Krizhevsky et al., 2009] and Lacuna-10 [Golatkar et al., 2020a], which is derived from VGG-Faces [Cao and Yang, 2015] |
| Dataset Splits | Yes | For the small-scale, we exactly follow the setup in [Golatkar et al., 2020b] that uses only 5 classes from each of CIFAR and Lacuna (CIFAR-5 / Lacuna-5), with 100 train, 25 validation and 100 test examples per class. [...] In our experiments, the train, test, and validation sizes are 40000, 10000, and 10000 respectively. |
| Hardware Specification | Yes | For scale-up experiments, the code is executed in Python 3.8, on an Ubuntu 20 machine with 40 CPU cores, a Nvidia GTX 2080 GPU and 256GB memory. |
| Software Dependencies | No | For scale-up experiments, the code is executed in Python 3.8, on an Ubuntu 20 machine with 40 CPU cores, a Nvidia GTX 2080 GPU and 256GB memory. (Mentions Python version, but no other key libraries with versions.) |
| Experiment Setup | Yes | For all experiments, we initialize the learning rate at 0.0005 and decay it by 0.1 after a number of min and max steps. [...] We apply a weight decay of 0.1 for small-scale setting and 0.0005 for large scale experiments, with a momentum of 0.9. Finally, we use different batch sizes for the forget-set and the retain-set to control the number of iteration in each direction, i.e the max and the min respectively. We report these in Table 3. |
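
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, which may help when trying to reproduce the runs. The following is a minimal sketch and not the authors' code: the names `SetupConfig`, `learning_rate`, `SMALL_SCALE`, and `LARGE_SCALE` are hypothetical, the `decay_after_steps` value is a placeholder (the paper reports the actual step counts and batch sizes in its Table 3), and a single decay is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupConfig:
    """Hyperparameters quoted in the Experiment Setup row."""

    initial_lr: float = 0.0005   # quoted: initial learning rate
    lr_decay: float = 0.1        # quoted: multiplicative decay factor
    momentum: float = 0.9        # quoted: momentum
    weight_decay: float = 0.1    # quoted: small-scale setting
    decay_after_steps: int = 10  # PLACEHOLDER: real values are in the paper's Table 3


SMALL_SCALE = SetupConfig()
LARGE_SCALE = SetupConfig(weight_decay=0.0005)  # quoted: large-scale setting


def learning_rate(cfg: SetupConfig, step: int) -> float:
    """Return the learning rate at `step`, assuming one decay after the threshold."""
    if step < cfg.decay_after_steps:
        return cfg.initial_lr
    return cfg.initial_lr * cfg.lr_decay
```

Calling `learning_rate(SMALL_SCALE, 0)` returns the initial rate, and any step at or beyond `decay_after_steps` returns the rate scaled by the decay factor. Batch sizes for the forget-set and retain-set are omitted here because the row only points to Table 3 for them.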