Memoria: Resolving Fateful Forgetting Problem through Human-Inspired Memory Architecture
Authors: Sangjun Park, Jinyeong Bak
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results prove the effectiveness of Memoria in the diverse tasks of sorting, language modeling, and classification, surpassing conventional techniques. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Sungkyunkwan University, Suwon, South Korea. Correspondence to: Jin Yeong Bak <jy.bak@skku.edu>. |
| Pseudocode | Yes | Algorithm 1 Retrieve Stage |
| Open Source Code | Yes | The implementation of Memoria and all experimental code are publicly available at https://github.com/cosmoquester/memoria |
| Open Datasets | Yes | Secondly, we performed language modeling for token-level on WikiText-103 (Raw) (Merity et al., 2017) and PG-19 (Rae et al., 2020), and character-level on enwik8 (Mahoney, 2006). ... Lastly, we conducted the classification task on the long document classification dataset, Hyperpartisan (Kiesel et al., 2019). |
| Dataset Splits | Yes | We report validation and test set results because of data distribution discrepancies. |
| Hardware Specification | Yes | One or more NVIDIA A100 or A6000 GPUs were used for training. |
| Software Dependencies | No | The paper mentions software such as the GPT-2 tokenizer, the Adam optimizer, a linear scheduler, and PyTorch, but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | For all sorting experiments, a batch size of 32, a warmup rate of 0.06, a learning rate of 2e-4, and an epoch of 5 were used for 80,000 train examples. Memoria parameters used in the experiment were as follows: an initial lifespan of 5, a lifespan extension scale α of 8, and a long-term memory search depth Ndepth of 10 in all cases. |
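The Experiment Setup row above reports the sorting-task hyperparameters and the Memoria-specific parameters as prose. As a reading aid, the following is a minimal sketch that collects those reported values into a single Python dataclass; the field names (`initial_lifespan`, `lifespan_extend_scale`, `ltm_search_depth`, and so on) are illustrative labels chosen here, not identifiers taken from the authors' repository at https://github.com/cosmoquester/memoria.

```python
from dataclasses import dataclass, asdict


@dataclass
class SortingExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row.

    Field names are illustrative labels, not identifiers from the
    authors' code base.
    """

    # Training settings reported for all sorting experiments
    batch_size: int = 32
    warmup_rate: float = 0.06
    learning_rate: float = 2e-4
    epochs: int = 5
    num_train_examples: int = 80_000

    # Memoria parameters reported in the paper
    initial_lifespan: int = 5           # initial engram lifespan
    lifespan_extend_scale: float = 8.0  # lifespan extension scale (alpha)
    ltm_search_depth: int = 10          # long-term memory search depth (N_depth)


if __name__ == "__main__":
    config = SortingExperimentConfig()
    for name, value in asdict(config).items():
        print(f"{name}: {value}")
```

Keeping the reported values in a dataclass like this makes it straightforward to compare a reproduction run's settings against the paper's stated configuration, though the actual training scripts in the released repository should be treated as the authoritative source.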