reproducibilityindex.ai

Lift Yourself Up: Retrieval-augmented Text Generation with Self-Memory

Authors: Xin Cheng, Di Luo, Xiuying Chen, Lemao Liu, Dongyan Zhao, Rui Yan

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the effectiveness of Selfmem on three distinct text generation tasks: neural machine translation, abstractive text summarization, and dialogue generation, under two generation paradigms: fine-tuned small model and few-shot LLM. Our approach achieves state-of-the-art results in four directions on the JRC-Acquis translation dataset, 50.3 ROUGE-1 on XSum, and 62.9 ROUGE-1 on BigPatent, demonstrating the potential of self-memory in enhancing retrieval-augmented generation models.
Researcher Affiliation	Collaboration	Xin Cheng1, Di Luo2, Xiuying Chen3, Lemao Liu4, Dongyan Zhao1, Rui Yan2. 1 Peking University, 2 Renmin University of China, 3 KAUST, 4 Tencent AI Lab
Pseudocode	Yes	Algorithm 1 Selfmem Framework
Open Source Code	Yes	Code and data available at: https://github.com/Hannibal046/SelfMemory
Open Datasets	Yes	We assess the performance of Selfmem on three generation tasks, utilizing a total of seven datasets. Translation. We evaluate our framework on JRC-Acquis datasets [82], a collection of parallel legislative text of European Union Law... Summarization. We evaluate on 2 summarization datasets: 1) XSum [60]... 2) BigPatent [73]... Dialogue. We experiment on DailyDialog [44]...
Dataset Splits	Yes	Table 7: Dataset statistics for three tasks. Task Dataset #Train #Dev #Test ... JRC (en de) 663,487 2,454 2,483 ... XSum 204,045 11,332 11,334
Hardware Specification	Yes	All experiments are conducted on the same device, equipped with one NVIDIA A100 GPU and one AMD EPYC 7V13 64-Core Processor.
Software Dependencies	No	The paper mentions software components such as SacreBLEU, Adafactor, Transformer, XLM-R-base, BART-base, BRIO, and RoBERTa, but does not specify their version numbers or the versions of the underlying programming languages or libraries (e.g., Python, PyTorch).
Experiment Setup	Yes	The hyper-parameter setting follows [17] with dropout 0.1, label smoothing 0.1, gradient clipping 1.0, Adafactor [74], warm-up steps 4000, maximum learning rate 4.4e-2, and 30 training epochs in total. The maximum input length is 512 for XSum and 1024 for BigPatent.
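
The values quoted in the Experiment Setup row can be collected into a single configuration object, which makes a reproduction attempt easier to audit. The sketch below is illustrative only and is not taken from the authors' code. The class name, field names, and dataset keys are hypothetical, and grouping all of the quoted settings into one object is a convenience. The paper excerpt gives the warm-up length and the maximum learning rate but not the schedule shape, so the linear warm-up followed by a constant rate is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class TrainingConfig:
    """Hyper-parameters quoted in the Experiment Setup row.

    Field names are hypothetical; only the values come from the source.
    """
    dropout: float = 0.1
    label_smoothing: float = 0.1
    gradient_clipping: float = 1.0
    optimizer: str = "adafactor"
    warmup_steps: int = 4000
    max_learning_rate: float = 4.4e-2
    epochs: int = 30
    # Maximum input length in tokens, keyed by a hypothetical dataset name.
    max_input_length: Dict[str, int] = field(
        default_factory=lambda: {"xsum": 512, "bigpatent": 1024}
    )


def warmup_learning_rate(step: int, cfg: TrainingConfig) -> float:
    """Return the learning rate at a given optimizer step.

    ASSUMPTION: linear warm-up to the maximum rate, then constant.
    The source does not state the schedule shape.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return cfg.max_learning_rate * min(1.0, step / cfg.warmup_steps)


if __name__ == "__main__":
    cfg = TrainingConfig()
    for step in (0, 1000, 4000, 8000):
        print(step, warmup_learning_rate(step, cfg))
```

Because the Software Dependencies row reports that no library versions are given, a configuration like this covers only the hyper-parameters; the software environment would still need to be pinned separately.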