MEMORYLLM: Towards Self-Updatable Large Language Models
Authors: Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks. |
| Researcher Affiliation | Collaboration | ¹UC San Diego, ²Amazon, ³UC Los Angeles. |
| Pseudocode | Yes | Algorithm 1 Training Strategy for Mitigating Forgetting Problems |
| Open Source Code | Yes | Our code and model are open-sourced at https://github.com/wangyu-ustc/MemoryLLM. |
| Open Datasets | Yes | We train our model on the processed version of the C4 dataset (Raffel et al., 2020) from Red-Pajama (Computer, 2023). |
| Dataset Splits | No | The paper uses the C4 dataset for training and relies on existing benchmark datasets (ZsRE, CounterFactual, LongBench, SQuAD, NaturalQA) with their respective evaluation methodologies. However, it does not explicitly provide the training/validation/test splits (e.g., percentages or exact counts) used for the main model's development and evaluation, beyond describing the subsets used for the evaluation tasks. |
| Hardware Specification | Yes | The training is performed on 8 A100-80GB GPUs for three days. |
| Software Dependencies | No | The paper mentions using Llama2-7b but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | In our instantiation, N = 7,680 and K = 256. (Section 4.5.1) |
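
The Experiment Setup row quotes the paper's memory-pool instantiation (N = 7,680 memory slots, with K = 256 tokens injected per update). The sketch below is only an illustration of how such a fixed-size, self-updating pool can absorb new knowledge while retaining old information: it drops K randomly chosen slots and appends K new memory vectors, in the spirit of the forgetting-mitigation strategy named in the Pseudocode row (Algorithm 1). The function name, tensor shapes, and the hidden size of 4096 (Llama2-7B's) are assumptions for the example, not taken from the released code.

```python
import torch

# Illustrative values from the Experiment Setup row; HIDDEN = 4096 is assumed
# (Llama2-7B hidden size), not stated in the quoted text.
N, K, HIDDEN = 7680, 256, 4096


def self_update(memory: torch.Tensor, new_tokens: torch.Tensor) -> torch.Tensor:
    """Drop K randomly chosen memory slots, then append K new memory vectors.

    memory:     (N, HIDDEN) current memory pool
    new_tokens: (K, HIDDEN) memory vectors derived from the new knowledge
    """
    assert memory.shape == (N, HIDDEN) and new_tokens.shape == (K, HIDDEN)
    keep = torch.randperm(N)[: N - K]          # indices of slots to retain
    survivors = memory[keep]                   # (N - K, HIDDEN)
    return torch.cat([survivors, new_tokens])  # pool size stays fixed at N


# Example update: the pool keeps its size, and only K/N ≈ 3.3% of the old
# slots are displaced per update, so older content decays gradually.
memory = torch.randn(N, HIDDEN)
new_tokens = torch.randn(K, HIDDEN)
memory = self_update(memory, new_tokens)
print(memory.shape)  # torch.Size([7680, 4096])
```

Under this kind of scheme, each update overwrites only a small fraction of the pool, which is one way to reconcile the new-knowledge injection measured on the model editing benchmarks with the long-term retention measured in the custom evaluations; consult the open-sourced repository for the authors' actual implementation.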