Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MEMORYLLM: Towards Self-Updatable Large Language Models
Authors: Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks. |
| Researcher Affiliation | Collaboration | 1UC San Diego, 2Amazon, 3UC Los Angeles. |
| Pseudocode | Yes | Algorithm 1 Training Strategy for Mitigating Forgetting Problems |
| Open Source Code | Yes | Our code and model are open-sourced at https://github.com/wangyu-ustc/MemoryLLM. |
| Open Datasets | Yes | We train our model on the processed version of the C4 dataset (Raffel et al., 2020) from RedPajama (Computer, 2023). |
| Dataset Splits | No | The paper uses the C4 dataset for training and existing benchmark datasets (zsRE, CounterFactual, LongBench, SQuAD, NaturalQA) with their respective evaluation methodologies. However, beyond describing the subsets used for evaluation tasks, it does not explicitly provide the specific training/validation/test splits (e.g., percentages or exact counts) used for the main model's development and evaluation. |
| Hardware Specification | Yes | The training is performed on 8 A100-80GB GPUs for three days. |
| Software Dependencies | No | The paper mentions using Llama2-7b but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | In our instantiation, N = 7,680 and K = 256. (Section 4.5.1) |