Unsupervised Cross-Task Generalization via Retrieval Augmentation

Authors: Bill Yuchen Lin, Kangmin Tan, Chris Miller, Beiwen Tian, Xiang Ren

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results and analysis show that it significantly outperforms both non-retrieval methods and other baseline methods. Our extensive experiments show that the proposed ReCross outperforms the baseline methods by a large margin.
Researcher Affiliation | Academia | University of Southern California; Tsinghua University; {yuchen.lin,kangmint,millercs,xiangren}@usc.edu
Pseudocode | Yes | Algorithm 1: Distant Supervision Creation
Open Source Code | Yes | Our data, code, and supplementary materials are at https://inklab.usc.edu/ReCross/.
Open Datasets | Yes | We follow Sanh et al. (2021) to use the templates from PromptSource (Bach et al., 2022) for converting data of different types of NLP tasks to text-to-text formats. The data we used are all open-source and publicly available via the datasets library from Hugging Face.
Dataset Splits | No | The paper mentions using 'held-out labeled data' for evaluation and 'query sets' for retrieval, but it does not provide specific percentages or counts for the training, validation, and test splits needed for reproduction.
Hardware Specification | No | The paper mentions the use of 'popular affordable GPUs' for fine-tuning but does not provide specific details on the GPU models, CPU models, or other hardware used for running the experiments.
Software Dependencies | No | The paper mentions using the 'FAISS library' and 'RoBERTa model' but does not specify their version numbers or other software dependencies with specific versions.
Experiment Setup | Yes | In our main experiments, we use |Q_i| = 16 query examples for each unseen task U_i and retrieve |R_i| = 512 examples for augmenting BART0. In the fine-tuning stage, we use a learning rate of 1e-6 and a batch size of 4 to continually fine-tune all layers of BART0 for 2 epochs. As for re-ranking, we set the upsampling ratio µ = 2, meaning that we first retrieve 1024 examples for reranking and use the top 512 ones as the final retrieved data.
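
The Experiment Setup row above is concrete enough to sketch. The snippet below is a minimal, hedged illustration of the retrieve-then-rerank step using the numbers quoted from the paper (16 query examples per unseen task, an upsampling ratio of 2, and a final budget of 512 retrieved examples). The random embeddings, the inner-product FAISS index, and the mean-query reranking score are illustrative stand-ins, not the paper's actual dense retriever or trained reranker.

```python
# Sketch of a ReCross-style retrieve-then-rerank step with the quoted hyperparameters.
# Assumptions (not from the paper): random vectors stand in for real example embeddings,
# and the reranking score is a placeholder for the paper's distantly supervised reranker.
import numpy as np
import faiss

DIM = 768            # embedding dimensionality (assumed)
NUM_QUERIES = 16     # |Q_i|: query examples per unseen task
FINAL_K = 512        # |R_i|: examples kept for augmentation
UPSAMPLE = 2         # µ: over-retrieve UPSAMPLE * FINAL_K candidates before reranking

rng = np.random.default_rng(0)

# Hypothetical pool of 100k embedded upstream training examples.
pool = rng.standard_normal((100_000, DIM)).astype("float32")
faiss.normalize_L2(pool)
index = faiss.IndexFlatIP(DIM)   # inner product == cosine on normalized vectors
index.add(pool)

# Hypothetical embeddings of the 16 unlabeled query examples for one unseen task.
queries = rng.standard_normal((NUM_QUERIES, DIM)).astype("float32")
faiss.normalize_L2(queries)

# Dense retrieval: over-retrieve so reranking sees 1024 candidates in total.
_, ids = index.search(queries, UPSAMPLE * FINAL_K)
candidates = np.unique(ids.ravel())[: UPSAMPLE * FINAL_K]

# Placeholder reranker: score candidates against the mean query embedding,
# then keep the top 512 as the final retrieved set.
scores = pool[candidates] @ queries.mean(axis=0)
retrieved = candidates[np.argsort(-scores)][:FINAL_K]

print(f"Retrieved {len(retrieved)} examples for augmenting BART0")
```

Per the quoted setup, the 512 retrieved examples would then be used to continually fine-tune all layers of BART0 for 2 epochs with a learning rate of 1e-6 and a batch size of 4.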