Fast Model Editing at Scale

Authors: Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning

ICLR 2022

Reproducibility assessment. Each variable below is listed with its result and the supporting LLM response:

Research Type: Experimental
LLM Response: "Our experiments with T5, GPT, BERT, and BART models show that MEND is the only approach to model editing that effectively edits the behavior of models with more than 10 billion parameters."

Researcher Affiliation: Academia
LLM Response: "Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning, Stanford University (eric.mitchell@cs.stanford.edu)"

Pseudocode: Yes
LLM Response: "Algorithm 1 (MEND Training) and Algorithm 2 (MEND Edit Procedure)"

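The core of Algorithm 2 is applying a hypernetwork-transformed rank-1 gradient update to each edited weight matrix. Below is a minimal PyTorch sketch of that edit step; the `editor` network and all variable names are placeholders for illustration, not the authors' implementation.

```python
import torch

def mend_edit_step(W, u, delta, editor, edit_lr):
    """One MEND-style edit applied to a weight matrix W (shape [d_out, d_in]).

    u:      layer input that produced the gradient (shape [d_in])
    delta:  gradient of the edit loss w.r.t. the layer's pre-activations
            (shape [d_out]); the raw fine-tuning gradient is the rank-1
            outer product of delta and u
    editor: hypothetical MEND hypernetwork mapping (u, delta) to
            transformed factors (u_tilde, delta_tilde)
    """
    u_tilde, delta_tilde = editor(u, delta)
    # The transformed gradient stays rank-1, so the update is cheap to form.
    return W - edit_lr * torch.outer(delta_tilde, u_tilde)
```

Keeping the update in its decomposed (u, delta) form is what lets the editor scale to very large weight matrices without materializing full gradients.
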
Open Source Code: Yes
LLM Response: "Code available at https://sites.google.com/view/mend-editing."

Open Datasets: Yes
LLM Response: "Specifically, for seq2seq models, we use the zsRE question-answering dataset (Levy et al., 2017) ... For classification models (e.g., BERT), we use the FEVER fact-checking dataset (Thorne et al., 2018)..."

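For readers who want to fetch the classification data, FEVER is distributed on the Hugging Face hub; the dataset identifier and config name below are assumptions about that distribution, not necessarily the paper's exact splits.

```python
from datasets import load_dataset

# Config name "v1.0" is an assumption about the hub distribution of FEVER;
# the zsRE data (Levy et al., 2017) is distributed separately.
fever = load_dataset("fever", "v1.0")
print(fever["train"][0])
```
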
Dataset Splits: Yes
LLM Response: "For all algorithms, we use early stopping: training ends if the validation loss L = c_edit * L_e + L_loc does not decrease for 20,000 steps on a subset of 500 validation examples, with a maximum of 500,000 training steps."

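A minimal sketch of that stopping rule, assuming hypothetical `train_one_step` and `compute_val_loss` helpers (and, for simplicity, checking the validation loss every step rather than periodically):

```python
# Hypothetical helpers: `train_one_step` and `compute_val_loss` stand in
# for the actual training and validation code.
MAX_STEPS = 500_000   # maximum number of training steps
PATIENCE = 20_000     # stop if validation loss has not improved for this long
VAL_SUBSET = 500      # validation examples used for the check

best_val_loss = float("inf")
steps_since_improvement = 0

for step in range(MAX_STEPS):
    train_one_step()
    # L = c_edit * L_e + L_loc, evaluated on the 500-example subset
    val_loss = compute_val_loss(num_examples=VAL_SUBSET)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        steps_since_improvement = 0
    else:
        steps_since_improvement += 1
    if steps_since_improvement >= PATIENCE:
        break  # early stop: no improvement for 20,000 steps
```
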
Hardware Specification: Yes
LLM Response: "All runs are trained entirely on a single NVIDIA RTX Titan or A40 GPU."

Software Dependencies: Yes
LLM Response: "We use PyTorch (Paszke et al., 2019) for all experiments, specifically using the Higher library (Grefenstette et al., 2019) in order to implement the bi-level optimization in ENN as well as the inner loop of model editing for all algorithms."

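Higher's `innerloop_ctx` is the standard way to make such an inner loop differentiable. The sketch below shows the general pattern; `model`, `edit_input`, and `edit_target` are placeholders, and the call signature assumes a Hugging Face-style model that returns a `.loss`.

```python
import torch
import higher

# `model`, `edit_input`, and `edit_target` are placeholders.
inner_opt = torch.optim.SGD(model.parameters(), lr=1e-6)

with higher.innerloop_ctx(model, inner_opt, copy_initial_weights=True) as (fmodel, diffopt):
    # Inner loop: one gradient step on the edit example.
    edit_loss = fmodel(edit_input, labels=edit_target).loss
    diffopt.step(edit_loss)
    # `fmodel` now holds the edited parameters; because the step was taken
    # functionally, gradients can flow through the edit back to any
    # outer-loop (meta) parameters during bi-level optimization.
```
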
Experiment Setup: Yes
LLM Response: "We use edit learning rates of 5e-6 for GPT-Neo and GPT-J and 1e-4 for T5 models, and 1e-6 for the smaller models... We use a batch size of 10 (with gradient accumulation) and the seed 0 for all experiments."
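
Gathered into one place, the quoted hyperparameters might look like the following config; the structure is illustrative, not the repository's actual configuration format.

```python
# The hyperparameters quoted above, collected into one illustrative config
# (the structure is a sketch, not the repository's actual config format).
EDIT_LR = {
    "gpt-neo": 5e-6,
    "gpt-j": 5e-6,
    "t5": 1e-4,
    "smaller_models": 1e-6,
}

TRAIN = {
    "batch_size": 10,       # with gradient accumulation
    "seed": 0,
    "max_steps": 500_000,   # from the dataset-splits entry above
}
```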