Locating and Editing Factual Associations in GPT

Authors: Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate factual predictions while processing subject tokens. To test our hypothesis that these computations correspond to factual association recall, we modify feedforward weights to update specific factual associations using Rank-One Model Editing (ROME). We find that ROME is effective on a standard zero-shot relation extraction (zsRE) model-editing task. We also evaluate ROME on a new dataset of difficult counterfactual assertions, on which it simultaneously maintains both specificity and generalization, whereas other methods sacrifice one or another.
Researcher Affiliation | Academia | Kevin Meng, MIT CSAIL; David Bau, Northeastern University; Alex Andonian, MIT CSAIL; Yonatan Belinkov, Technion – IIT
Pseudocode | No | The paper describes its method steps (Step 1: Choosing k to Select the Subject; Step 2: Choosing v to Recall the Fact; Step 3: Inserting the Fact) in Section 3.1, but these are presented as descriptive text rather than formal pseudocode or algorithm blocks. A hedged sketch of the resulting rank-one update appears after this table.
Open Source Code | Yes | The code, dataset, visualizations, and an interactive demo notebook are available at https://rome.baulab.info/.
Open Datasets | Yes | The code, dataset, visualizations, and an interactive demo notebook are available at https://rome.baulab.info/. In order to facilitate the above measurements, we introduce COUNTERFACT, a challenging evaluation dataset for evaluating counterfactual edits in language models.
Dataset Splits | No | Our evaluation slice contains 10,000 records, each containing one factual statement, its paraphrase, and one unrelated factual statement. Table 4 showcases quantitative results on GPT-2 XL (1.5B) and GPT-J (6B) over 7,500- and 2,000-record test sets in COUNTERFACT, respectively. The paper specifies the sizes of its evaluation and test sets, but it does not explain how the data was divided into training, validation, and test portions, nor does it refer to standard splits.
Hardware Specification | Yes | All experiments were run on a single NVIDIA 3090 GPU, with the exception of GPT-J, which was run on an NVIDIA A100 GPU at the Technion, and GPT-2 XL baselines, which were run on a cluster of A100s at Northeastern. (Appendix E.1)
Software Dependencies | No | The codebase is written in Python using PyTorch. (Appendix E.1) While the paper mentions Python and PyTorch, it does not specify version numbers for these or any other software components.
Experiment Setup | Yes | We optimize v for 100 steps using Adam (Kingma & Ba, 2015) with an initial learning rate of 0.0001, linearly decaying to 0. (Appendix E.5) Our Adam optimizer (Kingma & Ba, 2015) uses a learning rate of 1e-4 and a batch size of 1. We trained for 100 steps for each fact. (Appendix E.3) A sketch of this optimization loop follows the table.
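
The Pseudocode row notes that the paper's three steps are described only in prose. As a reading aid, here is a minimal sketch of the rank-one update those steps produce, assuming the key vector k_star (Step 1), the optimized value vector v_star (Step 2), and an uncentered key covariance C estimated from sample text are already in hand; the function and variable names are illustrative, not the authors' released code.

```python
# Hedged sketch of a ROME-style rank-one edit (Step 3: Inserting the Fact).
# Assumes k_star, v_star, and C were computed beforehand; names are illustrative.
import torch

def rank_one_edit(W: torch.Tensor,       # (d, d_m) second MLP matrix at the edited layer
                  k_star: torch.Tensor,  # (d_m,)   key: MLP input representing the subject
                  v_star: torch.Tensor,  # (d,)     value: optimized to recall the new fact
                  C: torch.Tensor        # (d_m, d_m) uncentered covariance E[k k^T] of keys
                  ) -> torch.Tensor:
    # Solve C u = k_star rather than forming C^{-1} explicitly.
    u = torch.linalg.solve(C, k_star)           # (d_m,)
    # How far the current weights are from producing the desired value.
    residual = v_star - W @ k_star              # (d,)
    # Rank-one correction: the edited matrix maps k_star exactly to v_star
    # while perturbing responses to other keys as little as possible.
    scale = residual / (u @ k_star)             # (d,)
    return W + torch.outer(scale, u)            # (d, d_m)
```

Solving the linear system instead of inverting C is a numerical choice made for this sketch; the paper's own derivation is given in Section 3.1.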
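
Similarly, the Experiment Setup row quotes only optimizer hyperparameters. Below is a minimal sketch of that optimization loop for v, assuming the paper's loss is supplied as a callable (the name `objective` is a hypothetical stand-in); only Adam, a 1e-4 learning rate decayed linearly to 0, and 100 steps are taken from the quoted text.

```python
# Hedged sketch of the v-optimization (Step 2), using only the quoted
# hyperparameters: Adam, lr 1e-4 decayed linearly to 0, 100 steps, batch size 1.
import torch

def solve_for_v(v_init: torch.Tensor, objective, num_steps: int = 100) -> torch.Tensor:
    v = v_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([v], lr=1e-4)
    # Linear decay of the learning rate from 1e-4 down to 0 over the run.
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda step: max(0.0, 1.0 - step / num_steps))
    for _ in range(num_steps):
        opt.zero_grad()
        loss = objective(v)   # loss term(s) as defined in Section 3.1 of the paper
        loss.backward()
        opt.step()
        sched.step()
    return v.detach()
```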