Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Unlearning-Aware Minimization

Authors: Hoki Kim, Keonwoo Kim, Sungwon Chae, Sangwon Yoon

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that UAM outperforms existing methods across diverse benchmarks, including image classification datasets (CIFAR-10, CIFAR-100, Tiny ImageNet) and multiple-choice question-answering benchmarks for large language models (WMDP-Bio, WMDP-Cyber).
Researcher Affiliation Collaboration Hoki Kim (Chung-Ang University); Keonwoo Kim (NAVER Digital Healthcare LAB); Sungwon Chae (Seoul National University); Sangwon Yoon (Ministry of Justice, Republic of Korea)
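Pseudocode Yes Algorithm 1 RMU [22]
  Require: Model $h$, frozen weights $w_{\text{frozen}}$, trainable weights $w$, forget input $x_f$, retain input $x_r$, learning rate $\eta$, hyperparameters $c$, $\alpha$
  1: $z_f \leftarrow h(x_f, w)$
  2: $u \leftarrow v / \|v\|_2$, where $v_i \sim U(0, 1)$
  3: $L_f = \|z_f - c\,u\|_2^2$ (Forget loss)
  4: $z_r \leftarrow h(x_r, w)$, $z_r^{\text{frozen}} \leftarrow h(x_r, w_{\text{frozen}})$
  5: $L_r = \|z_r - z_r^{\text{frozen}}\|_2^2$ (Retain loss)
  6: $w \leftarrow w - \eta \nabla_w [L_f + \alpha L_r]$
Algorithm 2 UAM
  Require: Model $h$, frozen weights $w_{\text{frozen}}$, trainable weights $w$, forget input $x_f$, retain input $x_r$, learning rate $\eta$, hyperparameter $\rho$
  1: $z_f \leftarrow h(x_f, w)$, $z_f^{\text{frozen}} \leftarrow h(x_f, w_{\text{frozen}})$
  2: $L_f = \|z_f - z_f^{\text{frozen}}\|_2^2$ (Forget loss)
  3: $\hat{w} \leftarrow w + \rho\, \nabla L_f / \|\nabla L_f\|_2^2$ (Inner maximization)
  4: $z_r \leftarrow h(x_r, \hat{w})$, $z_r^{\text{frozen}} \leftarrow h(x_r, w_{\text{frozen}})$
  5: $L_r = \|z_r - z_r^{\text{frozen}}\|_2^2$ (Retain loss)
  6: $w \leftarrow w - \eta\, [I - \gamma P_f]\, \nabla L_r$ (Eq. 15)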
Open Source Code Yes To promote reproducibility and benchmarking within the machine unlearning community, we release implementations of existing baseline unlearning methods, along with our proposed framework, available at: https://github.com/Harry24k/machine-unlearning-pytorch.
Open Datasets Yes Extensive experiments demonstrate that UAM outperforms existing methods across diverse benchmarks, including image classification datasets (CIFAR-10, CIFAR-100, Tiny ImageNet) and multiple-choice question-answering benchmarks for large language models (WMDP-Bio, WMDP-Cyber).
Dataset Splits Yes For each dataset, we evaluate two unlearning scenarios: class-wise forgetting and random data forgetting. In the class-wise forgetting scenario, the forget set $D_f$ consists of all training samples from a single class. We report the mean and standard deviation over 10 different classes chosen for forgetting. In the random data forgetting scenario, $D_f$ consists of randomly sampled training examples across all classes. Results are averaged over three different random seeds. ... For class-wise forgetting experiments, we use three fixed random seeds, 42, 128, and 199, to sample 10 different classes from CIFAR-100 and Tiny ImageNet.
Hardware Specification Yes For the CIFAR-10 dataset, all experiments were performed on a single NVIDIA RTX 4090 GPU with 24 GB of memory. The experiments on Tiny ImageNet utilized six NVIDIA Titan V GPUs. ... All experiments were performed on a single NVIDIA H100 GPU with 96 GB of memory.
Software Dependencies No The paper mentions "PyTorch" as an automatic differentiation framework, and "Zephyr-7B-β [30]" as a baseline model, but it does not specify version numbers for any software libraries or dependencies. For example, it does not state "PyTorch 1.x" or "Python 3.x".
Experiment Setup Yes All models are trained using SGD with an initial learning rate of 0.1. The learning rate is reduced by a factor of 0.1 at epochs 100 and 150, for a total of 200 training epochs. We use a momentum of 0.9 and a weight decay of $5 \times 10^{-4}$. ... For LLM unlearning task... The learning rate is set to $5 \times 10^{-5}$. Following [22], we set β = 1.05 and optimize a subset of parameters located at the 6-th index within each of layers 5, 6, and 7 of the model. Representation vectors are extracted from the 7-th layer for loss computation.
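
The Pseudocode row quotes Algorithm 2 (UAM) only in compressed form, so a small worked example may help in reading it. The following is a minimal sketch, not the authors' implementation: it assumes a toy scalar model h(x, w) = w · x with hand-derived gradients, and it replaces step 6 with a plain gradient step because the quoted excerpt does not define γ or P_f. The names `sq_loss_and_grad` and `uam_like_step`, the `eps` guard, and all numeric values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_loss_and_grad(w, w_frozen, x):
    """Squared gap between trainable and frozen outputs of the toy model
    h(x, w) = w . x, together with its gradient with respect to w."""
    gap = dot(w, x) - dot(w_frozen, x)
    return gap * gap, [2.0 * gap * xi for xi in x]


def uam_like_step(w, w_frozen, x_f, x_r, lr=0.1, rho=0.05, eps=1e-12):
    # Steps 1-2: forget loss gradient at the current weights.
    _, g_f = sq_loss_and_grad(w, w_frozen, x_f)
    # Step 3: ascend the forget loss. The squared norm in the denominator
    # follows Algorithm 2 as quoted; eps (an assumption) avoids division
    # by zero when w equals w_frozen.
    norm_sq = dot(g_f, g_f) + eps
    w_hat = [wi + rho * gi / norm_sq for wi, gi in zip(w, g_f)]
    # Steps 4-5: retain loss and its gradient at the perturbed weights.
    loss_r, g_r = sq_loss_and_grad(w_hat, w_frozen, x_r)
    # Step 6, simplified (assumption): plain descent on the retain gradient,
    # without the [I - gamma * P_f] factor from the quoted update.
    w_new = [wi - lr * gi for wi, gi in zip(w, g_r)]
    return w_new, loss_r


# Hypothetical toy values, chosen only to exercise the function.
w_frozen = [1.0, -1.0]
w = [1.5, -0.5]
w_new, loss_r = uam_like_step(w, w_frozen, x_f=[1.0, 0.0], x_r=[1.0, 1.0])
```

The retain gradient is evaluated at the perturbed weights but applied to the unperturbed ones, which is the same perturb-then-descend pattern the quoted steps 3 to 6 describe.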
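
The two forgetting scenarios in the Dataset Splits row can be illustrated with index-based splits. This is a sketch under stated assumptions: the function names, the `forget_fraction` parameter, and the reading that each seed draws its own set of 10 classes are hypothetical, since the quoted text does not give the forget-set size for random forgetting or say exactly how the seeds map to classes.

```python
import random


def class_wise_split(labels, forget_class):
    """Forget set = every training index of one class; retain set = the rest."""
    forget = [i for i, y in enumerate(labels) if y == forget_class]
    retain = [i for i, y in enumerate(labels) if y != forget_class]
    return forget, retain


def random_split(num_samples, forget_fraction, seed):
    """Forget set = a seeded uniform sample of indices across all classes."""
    rng = random.Random(seed)
    k = int(num_samples * forget_fraction)
    forget = sorted(rng.sample(range(num_samples), k))
    forget_set = set(forget)
    retain = [i for i in range(num_samples) if i not in forget_set]
    return forget, retain


def sample_forget_classes(num_classes, how_many, seed):
    """Seeded choice of which classes to forget (one class per run)."""
    return sorted(random.Random(seed).sample(range(num_classes), how_many))


# Seeds quoted in the Dataset Splits row; 100 classes matches CIFAR-100.
classes_per_seed = {s: sample_forget_classes(100, 10, s) for s in (42, 128, 199)}
```

Using a dedicated `random.Random(seed)` instance rather than the global generator keeps each split reproducible regardless of other random calls in the same process.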
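
The image-classification training recipe in the Experiment Setup row (SGD, initial learning rate 0.1, decay by 0.1 at epochs 100 and 150, momentum 0.9, weight decay 5 × 10^-4) can be written out in a few lines. This sketch assumes zero-indexed epochs with the decay taking effect from the milestone epoch onward, and it uses the common formulation in which weight decay is added to the gradient before the momentum update; neither detail is stated in the quoted text, and the function names are hypothetical.

```python
def step_lr(epoch, base_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Learning rate at a given epoch under a multi-step decay schedule."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 weight decay on a list of floats."""
    new_w, new_v = [], []
    for wi, gi, vi in zip(w, grad, velocity):
        g = gi + weight_decay * wi   # weight decay folded into the gradient
        v = momentum * vi + g        # momentum buffer update
        new_v.append(v)
        new_w.append(wi - lr * v)    # parameter update
    return new_w, new_v


# Learning rates over the 200 quoted training epochs.
schedule = [step_lr(e) for e in range(200)]
```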