xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token

Authors: Xin Cheng, Xun Wang, Xingxing Zhang, Tao Ge, Si-Qing Chen, Furu Wei, Huishuai Zhang, Dongyan Zhao

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that xRAG achieves an average improvement of over 10% across six knowledge-intensive tasks and is compatible with various language model backbones, ranging from a dense 7B model to an 8x7B Mixture-of-Experts configuration.
Researcher Affiliation | Collaboration | Peking University; Microsoft; National Key Laboratory of General Artificial Intelligence
Pseudocode | No | The paper describes the xRAG architecture and training strategy in text and diagrams, but it does not include pseudocode or clearly labeled algorithm blocks; an illustrative sketch of the core idea appears after the table.
Open Source Code | Yes | Code is available at: https://github.com/Hannibal046/xRAG.
Open Datasets | Yes | Natural Questions [41], TriviaQA [33], and WebQuestions [8]... HotpotQA [81]... TruthfulQA [50]... FactKG [39].
Dataset Splits | No | The paper specifies the training and test datasets used for evaluation but does not explicitly describe a separate validation set or how the splits were constructed.
Hardware Specification | Yes | These experiments were performed on the same computational hardware, specifically an Nvidia A100 and an AMD EPYC 7V12 64-Core Processor.
Software Dependencies | No | The paper mentions several models and tools, such as Mistral-7B, Mixtral-8x7B, SFR, ColBERT-v2, and the Torch Profiler, but it does not specify their version numbers or other software dependencies.
Experiment Setup | Yes | In Table 9 and Table 10, we list the hyperparameters for Paraphrase Pretraining and Context-aware Instruction Tuning.
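
Since the paper provides no pseudocode (see the Pseudocode row above), the following is a minimal, illustrative PyTorch sketch of the core idea named in the title: compressing a retrieved document into a single token by projecting its dense retrieval embedding into the LLM's token-embedding space. The class name XragProjector, the two-layer MLP, and the dimensions are assumptions made for illustration, not the authors' implementation; the released code at https://github.com/Hannibal046/xRAG and the hyperparameters in Tables 9 and 10 (Paraphrase Pretraining and Context-aware Instruction Tuning) define the actual setup.

```python
import torch
import torch.nn as nn


class XragProjector(nn.Module):
    """Hypothetical projector that maps a dense retrieval embedding
    (e.g., from an SFR- or ColBERT-style retriever) into the LLM's
    token-embedding space, so one retrieved document is represented
    by a single soft token. Layer shapes are illustrative assumptions,
    not the paper's specification."""

    def __init__(self, retriever_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(retriever_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, doc_embedding: torch.Tensor) -> torch.Tensor:
        # (batch, retriever_dim) -> (batch, 1, llm_dim): one "document token"
        return self.mlp(doc_embedding).unsqueeze(1)


def build_inputs_embeds(projector, doc_embedding, prompt_token_embeds):
    """Prepend the single projected document token to the prompt's token
    embeddings; the result can be fed to a decoder LLM via `inputs_embeds`."""
    doc_token = projector(doc_embedding)                   # (B, 1, llm_dim)
    return torch.cat([doc_token, prompt_token_embeds], 1)  # (B, 1 + T, llm_dim)


if __name__ == "__main__":
    projector = XragProjector(retriever_dim=1024, llm_dim=4096)
    doc_emb = torch.randn(2, 1024)         # dense retrieval embeddings
    prompt_emb = torch.randn(2, 16, 4096)  # prompt token embeddings from the LLM
    fused = build_inputs_embeds(projector, doc_emb, prompt_emb)
    print(fused.shape)  # torch.Size([2, 17, 4096])
```

If this sketch matches the paper's setup, only such a projector (not the retriever or the LLM backbone) would need to be trained during the two stages listed in the Experiment Setup row; treat that reading as an assumption to be verified against the released code.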