Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
Authors: Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments further validate its efficacy on benchmarks like TOFU, MUSE and WMDP. Codes are available at https://github.com/OPTML-Group/Unlearn-Simple. |
| Researcher Affiliation | Collaboration | Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu. Michigan State University; University of California, Berkeley; IBM Research. Equal contributions. |
| Pseudocode | No | The paper describes methods using mathematical formulations (e.g., equations 1, 2, 3, 4, 5) and textual explanations, but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Codes are available at https://github.com/OPTML-Group/Unlearn-Simple. |
| Open Datasets | Yes | Extensive experiments further validate its efficacy on benchmarks like TOFU, MUSE and WMDP. [...] (1) TOFU [18] considers fictitious unlearning on a synthetic Q&A dataset. (2) MUSE [4] is designed to remove verbatim or knowledge memorization from News and Books datasets, including both verbatim texts and knowledge sets for unlearning evaluation. (3) WMDP [3] aims to prevent LLMs from generating hazardous content in domains such as biology, cybersecurity, and chemistry. |
| Dataset Splits | Yes | We generate 10000 samples from the retain distribution and 5000 each from Forget1 and Forget2 to form the retain and forget sets. We randomly split the datasets, using 80% of the samples for training and unlearning, and the remaining 20% for testing. [...] We select 20% of the original TOFU forget05 set as the relearning set over three epochs. |
| Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA A6000 GPU cards in a single node. |
| Software Dependencies | No | The paper mentions using a 'small GPT-2 model [66]' and 'AdamW [67]' for optimization, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For all experiments, we use a linear warm-up learning rate during the first epoch, followed by a linearly decaying learning rate in the remaining epochs. We initialize the process with LLaMA-2 7B and fine-tune the model on TOFU for 5 epochs with a batch size of 32 and a learning rate of $10^{-5}$ to obtain the original model. For Forget05, NPO is trained for up to 20 epochs with a learning rate of $10^{-5}$. We conducted a grid search for β in the range of [0.05, 0.2] and for λ in the range of [0.5, 1.5]. SimNPO is trained for 10 epochs with a learning rate of $10^{-5}$. The parameter β is grid-searched over the range [1.5, 3.5], γ is searched between [0.0, 2.0] with the default choice γ = 0, and λ is explored within the range [0.05, 0.25]. |
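
The learning-rate schedule quoted in the Experiment Setup row (linear warm-up during the first epoch, then linear decay over the remaining epochs) could be sketched roughly as follows. This is an illustrative reading of that one sentence, not the authors' code: the function name `lr_at_step`, the per-step granularity, the `steps_per_epoch` argument, and the choice to decay all the way to zero are assumptions that the quoted text does not confirm.

```python
def lr_at_step(step, steps_per_epoch, num_epochs, peak_lr=1e-5):
    """Learning rate at a 0-indexed optimizer step (hypothetical helper).

    Assumptions, not stated in the source: the schedule is updated per step,
    warm-up starts just above zero, and decay ends at exactly zero.
    """
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = steps_per_epoch  # warm-up spans the first epoch

    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last step of epoch 1.
        return peak_lr * (step + 1) / warmup_steps

    # Linear decay from peak_lr down to 0 across the remaining epochs.
    decay_steps = max(1, total_steps - warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / decay_steps


if __name__ == "__main__":
    # Example: 5 epochs of 10 steps each, peak learning rate 1e-5.
    for s in (0, 9, 10, 30, 50):
        print(s, lr_at_step(s, steps_per_epoch=10, num_epochs=5))
```

The peak value `1e-5` matches the learning rate quoted in the table. The number of steps per epoch depends on dataset size and batch size, so it is left as a parameter here.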