Unsupervised Cross-Task Generalization via Retrieval Augmentation

Authors: Bill Yuchen Lin, Kangmin Tan, Chris Miller, Beiwen Tian, Xiang Ren

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results and analysis show that it significantly outperforms both non-retrieval methods and other baseline methods. Our extensive experiments show that the proposed ReCross outperforms the baseline methods by a large margin.
Researcher Affiliation | Academia | University of Southern California; Tsinghua University; {yuchen.lin,kangmint,millercs,xiangren}@usc.edu
Pseudocode | Yes | Algorithm 1: Distant Supervision Creation
Open Source Code | Yes | Our data, code, and supplementary materials are at https://inklab.usc.edu/ReCross/.
Open Datasets | Yes | We follow Sanh et al. (2021) to use the templates from PromptSource (Bach et al., 2022) for converting data of different types of NLP tasks to text-to-text formats. The data we used are all open-source and publicly available via the datasets library from Hugging Face.
Dataset Splits | No | The paper mentions using 'held-out labeled data' for evaluation and 'query sets' for retrieval, but it does not provide specific percentages or counts for the training, validation, and test splits needed for reproduction.
Hardware Specification | No | The paper mentions the use of 'popular affordable GPUs' for fine-tuning but does not provide specific details on the GPU models, CPU models, or other hardware used for running the experiments.
Software Dependencies | No | The paper mentions using the 'FAISS library' and 'RoBERTa model' but does not specify their version numbers or other software dependencies with specific versions.
Experiment Setup | Yes | In our main experiments, we use |Q_i| = 16 query examples for each unseen task U_i and retrieve |R_i| = 512 examples for augmenting BART0. In the fine-tuning stage, we use a learning rate of 1e-6 and a batch size of 4 to continually fine-tune all layers of BART0 for 2 epochs. As for re-ranking, we set the upsampling ratio µ = 2, meaning that we first retrieve 1024 examples for reranking and use the top 512 ones as the final retrieved data.
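
The Experiment Setup row above is concrete enough to sketch. The snippet below is a minimal, hedged illustration of the retrieve-then-rerank step using the numbers quoted from the paper (16 query examples per unseen task, an upsampling ratio of 2, and a final budget of 512 retrieved examples). The random embeddings, the inner-product FAISS index, and the mean-query reranking score are illustrative stand-ins, not the paper's actual dense retriever or trained reranker.

```python
# Sketch of a ReCross-style retrieve-then-rerank step with the quoted hyperparameters.
# Assumptions (not from the paper): random vectors stand in for real example embeddings,
# and the reranking score is a placeholder for the paper's distantly supervised reranker.
import numpy as np
import faiss

DIM = 768            # embedding dimensionality (assumed)
NUM_QUERIES = 16     # |Q_i|: query examples per unseen task
FINAL_K = 512        # |R_i|: examples kept for augmentation
UPSAMPLE = 2         # µ: over-retrieve UPSAMPLE * FINAL_K candidates before reranking

rng = np.random.default_rng(0)

# Hypothetical pool of 100k embedded upstream training examples.
pool = rng.standard_normal((100_000, DIM)).astype("float32")
faiss.normalize_L2(pool)
index = faiss.IndexFlatIP(DIM)   # inner product == cosine on normalized vectors
index.add(pool)

# Hypothetical embeddings of the 16 unlabeled query examples for one unseen task.
queries = rng.standard_normal((NUM_QUERIES, DIM)).astype("float32")
faiss.normalize_L2(queries)

# Dense retrieval: over-retrieve so reranking sees 1024 candidates in total.
_, ids = index.search(queries, UPSAMPLE * FINAL_K)
candidates = np.unique(ids.ravel())[: UPSAMPLE * FINAL_K]

# Placeholder reranker: score candidates against the mean query embedding,
# then keep the top 512 as the final retrieved set.
scores = pool[candidates] @ queries.mean(axis=0)
retrieved = candidates[np.argsort(-scores)][:FINAL_K]

print(f"Retrieved {len(retrieved)} examples for augmenting BART0")
```

Per the quoted setup, the 512 retrieved examples would then be used to continually fine-tune all layers of BART0 for 2 epochs with a learning rate of 1e-6 and a batch size of 4.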