ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion

Authors: Shangyu Wu, Ying Xiong, Yufei Cui, Xue Liu, Buzhou Tang, Tei-Wei Kuo, Chun Jason Xue

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that the proposed ReFusion can achieve superior and robust performance in various NKI tasks. Finally, we conducted comprehensive experiments on 15 different NKI tasks.
Researcher Affiliation | Academia | 1 City University of Hong Kong; 2 Harbin Institute of Technology, Shenzhen; 3 MILA, McGill University; 4 National Taiwan University; 5 Mohamed bin Zayed University of Artificial Intelligence
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It describes steps and formulas in paragraph text and mathematical equations.
Open Source Code | Yes | Codes are available at [1]. (Footnote 1 points to https://github.com/luffy06/ReFusion.)
Open Datasets | Yes | We conduct comprehensive experiments across 15 NKI tasks, including 8 tasks from the GLUE benchmark (Wang et al., 2019), SNLI, SST-5, MR, CR, MNLI, MNLI-mm, Subj and TREC. All these dataset configurations are identical to those of LM-BFF (Gao et al., 2021). (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | All these dataset configurations are identical to those of LM-BFF (Gao et al., 2021). For the upper level, this paper aims to find the optimal combination of ranking schemes that maximizes performance on the validation set.
Hardware Specification | Yes | The proposed method was implemented using the PyTorch framework, utilizing the computational power of two NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions the "PyTorch framework" but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | The hyperparameters are listed as follows: the learning rate is 1e-5, the batch size is 32, the maximum sequence length is 128, the maximum number of steps is 1000, the number k of retrieved similar sentences is set to 64, and we save the last checkpoint. We use AdamW as the optimizer. The models are based on RoBERTa-large for fair comparison with LM-BFF. (A configuration sketch follows the table.)
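As a minimal illustration of the datasets cited in the Open Datasets row, the sketch below fetches one of the 8 GLUE tasks with the Hugging Face `datasets` library. This is an assumption about how the raw data could be obtained, not the paper's pipeline: ReFusion follows the LM-BFF (Gao et al., 2021) few-shot configurations, which are built with LM-BFF's own tooling rather than this generic loader.

```python
# Illustrative only: load SST-2, one of the 8 GLUE tasks listed in the table above.
# The paper's actual few-shot train/dev partitions follow LM-BFF and are not the
# full GLUE splits returned here.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")   # DatasetDict with train/validation/test splits
print(sst2)
print(sst2["train"][0])               # e.g. {'sentence': ..., 'label': ..., 'idx': ...}
```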
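The Experiment Setup row maps naturally onto a standard Hugging Face `transformers` fine-tuning configuration. The sketch below is a hedged assumption about how such a run could be set up (illustrative names such as the output directory are not from the paper); the authors' actual scripts, including the retrieval modules that use the k = 64 retrieved sentences, live in the linked repository.

```python
# Hedged sketch of the reported setup: RoBERTa-large, AdamW, learning rate 1e-5,
# batch size 32, maximum sequence length 128, 1000 training steps.
# This is NOT the authors' script; see https://github.com/luffy06/ReFusion for that.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

model_name = "roberta-large"  # chosen in the paper for fair comparison with LM-BFF
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="refusion-run",          # illustrative path, not from the paper
    learning_rate=1e-5,                 # reported learning rate
    per_device_train_batch_size=16,     # assumed split over the two V100 GPUs (2 x 16 = 32)
    max_steps=1000,                     # reported maximum steps
    save_strategy="no",                 # skip intermediate checkpoints; save once at the end
    optim="adamw_torch",                # AdamW optimizer as reported
)

def tokenize(batch):
    # the reported maximum sequence length of 128 is applied at tokenization time
    return tokenizer(batch["sentence"], truncation=True, max_length=128)
```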