A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Authors: Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate ReadAgent against baselines using retrieval methods, using the original long contexts, and using the gist memories. These evaluations are performed on three long-document reading comprehension tasks: QuALITY, NarrativeQA, and QMSum. |
| Researcher Affiliation | Industry | Google DeepMind. Correspondence to: Kuang-Huei Lee <leekh@google.com>, Ian Fischer <iansf@google.com>. |
| Pseudocode | No | The paper describes the steps of ReadAgent using prose and example prompts, but it does not include formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | Project website and demo: read-agent.github.io. We release the prompts for each task on read-agent.github.io. |
| Open Datasets | Yes | We evaluate ReadAgent's long-document reading comprehension ability on three long-context question-answering challenges: QuALITY (Pang et al., 2022), NarrativeQA (Kočiský et al., 2018) and QMSum (Zhong et al., 2021). |
| Dataset Splits | Yes | Although ReadAgent does not require any model training, we develop the proposed method on the training sets and test on the validation, test and/or development sets to avoid any risk of overfitting system hyperparameters. |
| Hardware Specification | No | The paper mentions using "instruction-tuned PaLM 2-L" and "GPT-3.5 Turbo", which are language models, not hardware specifications like GPU models, CPU models, or cloud computing instances. |
| Software Dependencies | No | The paper mentions using "instruction-tuned PaLM 2-L (Anil et al., 2023)", "GPT-3.5 Turbo", and the "Gemini API embedding model (models/embedding-001)", but it does not provide specific version numbers for general software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | As described in Section 3.1, max words and min words are two episode pagination hyperparameters. Table 8 gives their values for each of the experiments in Section 4. |
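The max_words and min_words hyperparameters in the last row bound how long each "page" of the long context can be during episode pagination. In the paper, ReadAgent asks the LLM itself to choose natural break points between those bounds; the greedy paragraph-boundary heuristic below, including the function name `paginate` and the threshold values shown, is a hypothetical simplification for illustration, not the authors' implementation.

```python
def paginate(paragraphs, min_words=280, max_words=600):
    """Greedy sketch of episode pagination: accumulate paragraphs into a
    page until the word count reaches min_words; never let a page exceed
    max_words. (ReadAgent instead prompts an LLM to pick break points
    within these bounds; paragraph boundaries are a stand-in here.)"""
    pages, current, count = [], [], 0
    for para in paragraphs:
        words = len(para.split())
        # Force a break if adding this paragraph would overshoot max_words.
        if current and count + words > max_words:
            pages.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
        # Once the soft minimum is reached, close the page.
        if count >= min_words:
            pages.append("\n\n".join(current))
            current, count = [], 0
    if current:
        pages.append("\n\n".join(current))
    return pages
```

Each page would then be gisted independently, with the gists concatenated into the gist memory that the agent consults before deciding which original pages to re-read.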