Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Revealing and Mitigating Over-Attention in Knowledge Editing

Authors: Pinzheng Wang, Zecheng Tang, Keyan Zhou, Juntao Li, Qiaoming Zhu, Min Zhang

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on five frequently used strong LLMs demonstrate the effectiveness of our method, where SADR can significantly mitigate Specificity Failure in the predominant knowledge editing tasks." "Table 1 illustrates a significant specificity failure when the edited subject occurs in the context, with the edited model incorrectly outputting the edited object in over 50% of test cases."
Researcher Affiliation | Academia | "1. School of Computer Science and Technology, Soochow University; 2. Key Laboratory of Data Intelligence and Advanced Computing, Soochow University"
Pseudocode | Yes | "Algorithm 1: The MEMIT Algorithm"
Open Source Code | Yes | "Code, dataset and an interactive demo notebook: https://github.com/PinzhengWang322/Reveal_Attention_Drift."
Open Datasets | Yes | "The dataset we use is a mixture of counterfact datasets from Meng et al. (2022) and Zhang et al. (2024a)." "Due to the limited availability of datasets that satisfy the required fields for our tasks, we combine COUNTERFACT (Meng et al., 2022) and WikiDatacounterfact (Zhang et al., 2024a) with 1,683 factual statements as the testing data."
Dataset Splits | Yes | "We test [20, 40, 80] optimization steps with restraining weights γ set at [5e-3, 1e-2, 4e-2, 8e-2] on the validation split." "When testing the trade-off between generalization and specificity, we randomly sample 500 data points for evaluation."
Hardware Specification | Yes | "All experiments are conducted on eight NVIDIA A100 (40GB) GPUs, with individual edits taking approximately 20 to 80 seconds on a single GPU."
Software Dependencies | Yes | "Implemented using EasyEdit. We build the human evaluation interface with the open-source Python web library Django."
Experiment Setup | Yes | "The learning rate is 0.5, optimization steps are 20, and the KL factor ω is 0.0625 across various models." "For GPT-J-6b, we edit layer 5, with optimization steps of 80 and a controlling weight γ = 1e-2 for the SADR method."
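For illustration, the tuning described in the Dataset Splits and Experiment Setup rows amounts to a 3 × 4 grid search over optimization steps and the restraining weight γ. A minimal sketch follows; the `evaluate_on_validation` callback is hypothetical and not part of the paper's released code:

```python
from itertools import product

# Grid quoted in the report: optimization steps crossed with
# restraining weights gamma, scored on a validation split.
OPT_STEPS = [20, 40, 80]
GAMMAS = [5e-3, 1e-2, 4e-2, 8e-2]

def grid_search(evaluate_on_validation):
    """Return the (steps, gamma) pair with the best validation score.

    `evaluate_on_validation(steps=..., gamma=...)` is a placeholder for
    whatever metric the authors optimize (e.g. edit specificity).
    """
    best_config, best_score = None, float("-inf")
    for steps, gamma in product(OPT_STEPS, GAMMAS):  # 3 x 4 = 12 configs
        score = evaluate_on_validation(steps=steps, gamma=gamma)
        if score > best_score:
            best_config, best_score = (steps, gamma), score
    return best_config, best_score
```

Under this reading, the reported GPT-J-6b setting (80 steps, γ = 1e-2) is simply the grid point that scored best on the validation split.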