Memorizing Documents with Guidance in Large Language Models
Authors: Bumjin Park, Jaesik Choi
IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on Wikitext-103-v1 with Pythia 1B show that the proposed methods provide different memory entries for documents and high recall of document-related content in generation with trained document-wise memories. |
| Researcher Affiliation | Academia | Bumjin Park (KAIST AI) and Jaesik Choi (KAIST AI, INEEJI); {bumjin, jaesik.choi}@kaist.ac.kr |
| Pseudocode | No | The paper does not include a dedicated section or figure for pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make the source code publicly available: https://github.com/fxnnxc/DocGuidanceLLM |
| Open Datasets | Yes | We train Pythia 1B [Biderman et al., 2023] to memorize Wikitext-103-v1 [Merity et al., 2017]. (A loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions using Wikitext-103-v1 but does not explicitly provide details about specific train/validation/test dataset splits used for the experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We individually train document-wise memories for 10, 20, and 50 documents with guidance α = 0.1 and τ = 2.5. For baselines, we train two types of memory modules without guidance. Shared is the MLP in Equation 3, and Add is a module that directly adds differential memory entries. We also evaluate three activation types for document memory entries: ReLU, Tanh, and Sigmoid, which affect memory selections. (An illustrative module sketch follows the table.) |
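
The paper names its dataset and model but does not state any loading code or preprocessing. The following is a minimal sketch, assuming the standard Hugging Face Hub identifiers for Wikitext-103-v1 and Pythia 1B; the splits shown are the Hub defaults, not splits confirmed by the paper.

```python
# Hedged sketch: loading Wikitext-103-v1 and Pythia 1B with Hugging Face
# `datasets` and `transformers`. Not taken from the paper's code release.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Wikitext-103-v1 [Merity et al., 2017] as hosted on the Hugging Face Hub.
wikitext = load_dataset("wikitext", "wikitext-103-v1")
print(wikitext)  # DatasetDict with 'train', 'validation', and 'test' splits

# Pythia 1B [Biderman et al., 2023].
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1b")
```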
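
The experiment setup mentions document-wise memory modules, an "Add" baseline, and a choice of activation (ReLU, Tanh, or Sigmoid) for memory entries, but the paper's Equation 3 and guidance objective (α = 0.1, τ = 2.5) are not reproduced here. The sketch below only illustrates the general shape of such a module: the class name, tensor shapes, document-id conditioning, and "Add"-style update are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of a document-wise memory with a selectable activation.
# Shapes and the projection layer are illustrative assumptions.
import torch
import torch.nn as nn

ACTIVATIONS = {"relu": nn.ReLU(), "tanh": nn.Tanh(), "sigmoid": nn.Sigmoid()}

class DocumentMemory(nn.Module):
    def __init__(self, num_docs: int, hidden_dim: int, mem_dim: int,
                 activation: str = "relu"):
        super().__init__()
        # One learnable memory entry per document (assumed layout).
        self.doc_memory = nn.Embedding(num_docs, mem_dim)
        # Activation gating the memory entry (ReLU / Tanh / Sigmoid).
        self.act = ACTIVATIONS[activation]
        # Projection back into the model's hidden space.
        self.proj = nn.Linear(mem_dim, hidden_dim)

    def forward(self, hidden: torch.Tensor, doc_id: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, hidden_dim); doc_id: (batch,)
        entry = self.act(self.doc_memory(doc_id))   # (batch, mem_dim)
        update = self.proj(entry).unsqueeze(1)      # (batch, 1, hidden_dim)
        # "Add"-style update: directly add the projected memory entry.
        return hidden + update

# Usage: 10 documents, hidden size 2048 (Pythia 1B's hidden width), ReLU gate.
mem = DocumentMemory(num_docs=10, hidden_dim=2048, mem_dim=2048, activation="relu")
h = torch.randn(2, 16, 2048)
doc_id = torch.tensor([0, 3])
out = mem(h, doc_id)
print(out.shape)  # torch.Size([2, 16, 2048])
```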