Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Memory-Based Model Editing at Scale

Authors: Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments indicate that SERAC consistently outperforms past approaches to model editing by a substantial margin on the three most difficult problems. Code, data, and additional project information will be made available at https://sites.google.com/view/serac-editing."
Researcher Affiliation | Academia | "1 Stanford University, Department of Computer Science; 2 EPFL, School of Computer and Communication Sciences."
Pseudocode | No | The paper does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | "Code, data, and additional project information will be made available at https://sites.google.com/view/serac-editing."
Open Datasets | Yes | "The QA setting uses the zsRE question-answering problem introduced by De Cao et al. (2021). We use this dataset as a starting point of reference to connect our evaluations with prior work. ... We introduce the FC setting, building on the VitaminC fact verification dataset (Schuster et al., 2021). ... As a base model, we use the BERT-base model trained by De Cao et al. (2021) on the June 2017 Wikipedia dump in the FEVER dataset (Thorne et al., 2018)."
Dataset Splits | Yes | "Data were randomly split (by entity) into 90-5-5 train/val/test splits."
Hardware Specification | No | The paper mentions models like T5-large and BERT-base but does not specify the hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies | No | The paper mentions using "Huggingface (Wolf et al., 2019) implementations" and specific models such as "distilbert-base-cased (Sanh et al., 2019)", but does not give version numbers for the software libraries or frameworks used (e.g., PyTorch version, Transformers library version).
Experiment Setup | Yes | "We use Adam with an outer-loop learning rate of 1 × 10⁻⁵, and an initial inner-loop learning rate of 1 × 10⁻², which is learned in the outer loop. ... All scope classifier and counterfactual models are trained using Adam with a learning rate of 1 × 10⁻⁵."
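The entity-based 90-5-5 split reported above can be sketched as follows. This is a minimal illustration, not the authors' code; the `entity` field name and the `split_by_entity` helper are hypothetical, but the key property matches the quoted description: splitting is done by entity, so all examples mentioning a given entity fall into exactly one of the train/val/test partitions.

```python
import random

def split_by_entity(examples, seed=0):
    """Split examples 90/5/5 into train/val/test by entity, so that
    every example for a given entity lands in a single split."""
    # "entity" is a hypothetical field name used for illustration.
    entities = sorted({ex["entity"] for ex in examples})
    rng = random.Random(seed)
    rng.shuffle(entities)
    n = len(entities)
    train_ents = set(entities[: int(0.90 * n)])
    val_ents = set(entities[int(0.90 * n) : int(0.95 * n)])
    splits = {"train": [], "val": [], "test": []}
    for ex in examples:
        if ex["entity"] in train_ents:
            splits["train"].append(ex)
        elif ex["entity"] in val_ents:
            splits["val"].append(ex)
        else:
            splits["test"].append(ex)
    return splits
```

Splitting by entity rather than by individual example prevents leakage: the model is never evaluated on edits about an entity it saw during training.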