Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Authors: Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments further validate its efficacy on benchmarks like TOFU, MUSE and WMDP. Codes are available at https://github.com/OPTML-Group/Unlearn-Simple.
Researcher Affiliation | Collaboration | Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu. Michigan State University; University of California, Berkeley; IBM Research. Equal contributions.
Pseudocode | No | The paper describes methods using mathematical formulations (e.g., equations 1, 2, 3, 4, 5) and textual explanations, but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Codes are available at https://github.com/OPTML-Group/Unlearn-Simple.
Open Datasets | Yes | Extensive experiments further validate its efficacy on benchmarks like TOFU, MUSE and WMDP. [...] (1) TOFU [18] considers fictitious unlearning on a synthetic Q&A dataset. (2) MUSE [4] is designed to remove verbatim or knowledge memorization from News and Books datasets, including both verbatim texts and knowledge sets for unlearning evaluation. (3) WMDP [3] aims to prevent LLMs from generating hazardous content in domains such as biology, cybersecurity, and chemistry.
Dataset Splits | Yes | We generate 10000 samples from the retain distribution and 5000 each from Forget1 and Forget2 to form the retain and forget sets. We randomly split the datasets, using 80% of the samples for training and unlearning, and the remaining 20% for testing. [...] We select 20% of the original TOFU forget05 set as the relearning set over three epochs.
Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA A6000 GPU cards in a single node.
Software Dependencies | No | The paper mentions using a 'small GPT-2 model [66]' and 'AdamW [67]' for optimization, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | For all experiments, we use a linear warm-up learning rate during the first epoch, followed by a linearly decaying learning rate in the remaining epochs. We initialize the process with LLaMA-2 7B and fine-tune the model on TOFU for 5 epochs with a batch size of 32 and a learning rate of 10^-5 to obtain the original model. For Forget05, NPO is trained for up to 20 epochs with a learning rate of 10^-5. We conducted a grid search for β in the range of [0.05, 0.2] and for λ in the range of [0.5, 1.5]. SimNPO is trained for 10 epochs with a learning rate of 10^-5. The parameter β is grid-searched over the range [1.5, 3.5], γ is searched between [0.0, 2.0] with the default choice γ = 0, and λ is explored within the range [0.05, 0.25].
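
The learning-rate schedule quoted under Experiment Setup (linear warm-up during the first epoch, then linear decay over the remaining epochs) can be sketched as below. This is a minimal illustration, not the authors' code: the function name `lr_at_step` and the parameter `steps_per_epoch` are hypothetical, and the sketch assumes that warm-up rises from near zero to the peak rate and that decay heads toward zero, neither of which the quoted text specifies. The peak rate of 1e-5 and the 10-epoch example come from the quoted SimNPO setup.

```python
def lr_at_step(step, steps_per_epoch, num_epochs, peak_lr=1e-5):
    """Return the learning rate for a 0-indexed optimizer step.

    Assumption (not stated in the source): warm-up starts near zero and
    the decay phase ends near zero.
    """
    if steps_per_epoch <= 0 or num_epochs <= 0:
        raise ValueError("steps_per_epoch and num_epochs must be positive")
    warmup_steps = steps_per_epoch               # warm-up spans the first epoch
    total_steps = steps_per_epoch * num_epochs
    if not 0 <= step < total_steps:
        raise ValueError("step is outside the training run")
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last step of epoch 1.
        return peak_lr * (step + 1) / warmup_steps
    # Linear decay across all remaining epochs, starting from peak_lr.
    decay_steps = total_steps - warmup_steps
    return peak_lr * (total_steps - step) / decay_steps


if __name__ == "__main__":
    # Hypothetical run: 10 epochs (as quoted for SimNPO), 4 steps per epoch.
    for s in range(0, 40, 4):
        print(s, lr_at_step(s, steps_per_epoch=4, num_epochs=10))
```

In a real training loop the same shape is usually obtained from a framework scheduler rather than computed by hand; the function above only makes the two phases explicit.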