MEMORYLLM: Towards Self-Updatable Large Language Models

Authors: Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks.
Researcher Affiliation | Collaboration | UC San Diego, Amazon, UC Los Angeles
Pseudocode | Yes | Algorithm 1: Training Strategy for Mitigating Forgetting Problems
Open Source Code | Yes | Our code and model are open-sourced at https://github.com/wangyu-ustc/MemoryLLM.
Open Datasets | Yes | We train our model on the processed version of the C4 dataset (Raffel et al., 2020) from RedPajama (Computer, 2023).
Dataset Splits | No | The paper trains on the C4 dataset and evaluates on existing benchmarks (ZsRE, CounterFactual, LongBench, SQuAD, NaturalQA) with their respective evaluation methodologies. However, it does not explicitly report the training/validation/test splits (e.g., percentages or exact counts) used for the main model's development and evaluation, beyond describing the subsets used for the evaluation tasks.
Hardware Specification | Yes | The training is performed on 8 A100-80GB GPUs for three days.
Software Dependencies | No | The paper mentions using Llama2-7b but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | In our instantiation, N = 7,680 and K = 256. (Section 4.5.1; see the memory-update sketch below the table.)
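
The Experiment Setup row quotes the paper's memory-pool configuration: a pool of N = 7,680 memory tokens, of which K = 256 are replaced at each self-update. The following is a minimal NumPy sketch of the random drop-and-append update those numbers describe, together with the retention arithmetic behind the forgetting-mitigation claim. The array names, hidden size, and implementation details are illustrative assumptions for this report, not code taken from the authors' repository.

    import numpy as np

    # Illustrative sketch only: shapes and names are assumptions, not the released code.
    N, K, HIDDEN = 7680, 256, 4096   # pool size, tokens replaced per update, hidden width

    rng = np.random.default_rng(0)
    memory = rng.standard_normal((N, HIDDEN)).astype(np.float32)  # one layer's memory pool

    def self_update(memory, new_tokens):
        """Drop K randomly chosen old tokens and append K newly generated ones,
        keeping the pool at a fixed size of N tokens."""
        assert new_tokens.shape == (K, memory.shape[1])
        keep = rng.choice(memory.shape[0], size=memory.shape[0] - K, replace=False)
        keep.sort()  # preserve the relative order of the surviving tokens
        return np.concatenate([memory[keep], new_tokens], axis=0)

    # Because each update keeps an old token with probability 1 - K/N, the expected
    # fraction of tokens injected at step 0 that survive t later updates is (1 - K/N)^t.
    for t in (1, 10, 100):
        print(t, round((1 - K / N) ** t, 4))   # -> 0.9667, 0.7125, 0.0337

With K/N = 1/30, knowledge written into the pool decays geometrically but slowly, which is consistent with the long-term retention behaviour summarized in the Research Type row.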