Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On Effects of Steering Latent Representation for Large Language Model Unlearning

Authors: Huu-Tien Dang, Tin Pham, Hoang Thanh-Tung, Naoya Inoue

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that Adaptive RMU significantly improves the unlearning performance compared to prior art while incurring no additional computational cost. Experimental results show that Adaptive RMU achieves higher drop-in-accuracy for forget knowledge, maintaining high performance on general knowledge, and enables effective unlearning for most layers without incurring additional computational overhead.
Researcher Affiliation	Academia	Dang Huu-Tien (1), Tin Pham (1), Hoang Thanh-Tung (2), and Naoya Inoue (1, 3). (1) Japan Advanced Institute of Science and Technology; (2) VNU University of Engineering and Technology, Vietnam; (3) RIKEN
Pseudocode	Yes	Algorithm 1: Adaptive RMU pseudocode
Open Source Code	Yes	Our code is available at https://github.com/RebelsNLU-jaist/llm-unlearning.
Open Datasets	Yes	We use WMDP-Biology and WMDP-Cyber forget datasets as Dforget and Wikitext (Merity et al. 2022) as Dretain for unlearning the LLM. Unlearned models are evaluated on WMDP Q&A datasets and MMLU (Hendrycks et al. 2021).
Dataset Splits	No	The paper mentions using 'WMDP-Biology and WMDP-Cyber forget datasets as Dforget and Wikitext (Merity et al. 2022) as Dretain' for unlearning and 'WMDP Q&A datasets and MMLU (Hendrycks et al. 2021)' for evaluation. However, it does not specify explicit percentages, counts, or a methodology for splitting these datasets into training, validation, or test sets within the scope of this research.
Hardware Specification	Yes	Two NVIDIA A40s with 90GB GPU were used to run the experiments.
Software Dependencies	No	The paper mentions 'AdamW' as an optimizer but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch, TensorFlow, Python version) that were used for implementation.
Experiment Setup	Yes	Models were fine-tuned using AdamW (Loshchilov and Hutter 2019) with learning rate η = 5e-5, batch size of 4, max sequence length of 512 for WMDP-Biology and 768 for WMDP-Cyber, with T = 500 gradient update steps. The retain weight α = 1200. For the baseline RMU, we follow the previous work and let c = 6.5. We grid search for unlearn layer l from the third to the last layer. For the Adaptive RMU, we grid search for the scaling factor β ∈ {2, 3, 5, 10}. We report the performances of Adaptive RMU models with β = 5. We update three layers' parameters {l, l−1, l−2} of the model.
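
The reported setup can be collected into a small configuration object. The following is only a sketch of the hyperparameters quoted in the Experiment Setup row, not the authors' code: the class name, field names, and helper functions are hypothetical, and layer indices are assumed to be 1-based so that "from the third to the last layer" means layers 3 through `num_layers`.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class UnlearnConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    learning_rate: float = 5e-5      # eta, used with AdamW
    batch_size: int = 4
    max_seq_len_bio: int = 512       # WMDP-Biology
    max_seq_len_cyber: int = 768     # WMDP-Cyber
    num_steps: int = 500             # T gradient update steps
    retain_weight: float = 1200.0    # alpha
    rmu_coefficient: float = 6.5     # c, baseline RMU only
    scaling_factor: float = 5.0      # beta reported for Adaptive RMU


# Candidate scaling factors searched for Adaptive RMU.
BETA_GRID: Tuple[int, ...] = (2, 3, 5, 10)


def updated_layers(unlearn_layer: int) -> List[int]:
    """Layers whose parameters are updated: {l-2, l-1, l} (1-based, assumed)."""
    return [unlearn_layer - 2, unlearn_layer - 1, unlearn_layer]


def search_grid(num_layers: int) -> Iterator[Tuple[int, int]]:
    """Yield (unlearn_layer, beta) pairs for layers 3..num_layers (assumed 1-based)."""
    return product(range(3, num_layers + 1), BETA_GRID)
```

Under these assumptions, `updated_layers(7)` gives the three layers ending at layer 7, and `search_grid(num_layers)` enumerates every layer and β combination that the row describes.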