reproducibilityindex.ai

RECOMP: Improving Retrieval-Augmented LMs with Context Compression and Selective Augmentation

Authors: Fangyuan Xu, Weijia Shi, Eunsol Choi

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our approach on language modeling task and open domain question answering task. We achieve a compression rate of as low as 6% with minimal loss in performance for both tasks, significantly outperforming the off-the-shelf summarization models.
Researcher Affiliation	Academia	Fangyuan Xu1, Weijia Shi2, Eunsol Choi1 Department of Computer Science 1The University of Texas at Austin, 2University of Washington {fangyuan,eunsol}@utexas.edu , swj0419@cs.washington.edu
Pseudocode	Yes	Figure 2: Learning an extractive compressor for language modeling task. Figure 3: Learning an abstractive compressor for language modeling task.
Open Source Code	Yes	Our code is available at https://github.com/carriex/recomp.
Open Datasets	Yes	For the language modeling task, we generate training data using the training split of the Wikitext-103 dataset... Natural Questions (NQ) (Kwiatkowski et al., 2019), TriviaQA (Joshi et al., 2017) and HotpotQA (Yang et al., 2018).
Dataset Splits	Yes	We report results on development set of NQ, test set of TriviaQA and randomly sampled 500 examples from HotpotQA development set. Table 5: Training data statistics for abstractive and extractive compressors. NQ: Train 42,149, Validation 9,769; TQA: Train 70,032, Validation 8,753; HotpotQA: Train 24,526, Validation 3,068; Wikitext: Train 1,398,318, Validation 15,483.
Hardware Specification	Yes	We run FLAN-UL2 on 4 A40 GPUs. For compression, we run Contriever and T5 on a single A40 GPU (Table 6).
Software Dependencies	No	The paper mentions 'Transformers', the 'sentence-transformers' library, 'spaCy', and 'NLTK' but does not provide version numbers for any of them.
Experiment Setup	Yes	We train with Adam optimizer (Kingma & Ba, 2014), using a batch size of 64, learning rate of 2e-5 and 1000 warmup steps for 3 epochs. We train abstractive summarizer with Adam optimizer (Kingma & Ba, 2014), using a batch size of 16, learning rate of 1e-5 and 1000 warmup steps for 3 epochs.
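The two training configurations quoted in the Experiment Setup row can be written down as a small sketch. This is illustrative only and is not taken from the authors' repository: the names TrainConfig, FIRST_SETUP, ABSTRACTIVE_SETUP, lr_at_step and steps_per_epoch are hypothetical. The row gives the warmup length but not the schedule shape, so linear warmup followed by a constant rate is an assumption, as is the absence of gradient accumulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row."""
    batch_size: int
    peak_lr: float
    warmup_steps: int
    epochs: int


# First configuration quoted in the row (the row does not name the model it trains).
FIRST_SETUP = TrainConfig(batch_size=64, peak_lr=2e-5, warmup_steps=1000, epochs=3)

# Second configuration, quoted for the abstractive summarizer.
ABSTRACTIVE_SETUP = TrainConfig(batch_size=16, peak_lr=1e-5, warmup_steps=1000, epochs=3)


def lr_at_step(cfg: TrainConfig, step: int) -> float:
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to peak_lr over warmup_steps,
    then a constant rate. The source does not state the schedule shape.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < cfg.warmup_steps:
        return cfg.peak_lr * step / cfg.warmup_steps
    return cfg.peak_lr


def steps_per_epoch(num_examples: int, cfg: TrainConfig) -> int:
    """Optimizer steps per epoch, assuming no gradient accumulation
    and a final partial batch that is kept (ceiling division)."""
    return -(-num_examples // cfg.batch_size)
```

Calling lr_at_step(ABSTRACTIVE_SETUP, step) for increasing step values traces the assumed warmup curve, and steps_per_epoch combined with a training-set size from the Dataset Splits row shows how much of the first epoch the 1000 warmup steps would cover.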