HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization

Authors: ZeFeng Cai, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Xin Alex Lin, Liang He, Daxin Jiang

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show our model attains better retrieval performance across different tasks and better zero-shot transfer ability compared with various previous methods." From Section 3 (Experiments), Benchmark Datasets: "We use two publicly available retrieval datasets for evaluation, including KILT (Petroni et al., 2021) and BEIR (Thakur et al., 2021)."
Researcher Affiliation | Collaboration | Department of Computer Science, East China Normal University; Microsoft Corporation, Beijing, China
Pseudocode | No | The paper includes mathematical equations and descriptions of processes, but no structured pseudocode or algorithm blocks are explicitly present or labeled.
Open Source Code | Yes | "Our Code is available at https://github.com/oklen/Hyper."
Open Datasets | Yes | "We use two publicly available retrieval datasets for evaluation, including KILT (Petroni et al., 2021) and BEIR (Thakur et al., 2021)."
Dataset Splits | No | The paper mentions using the KILT and BEIR datasets but does not explicitly provide training/validation/test splits (percentages, sample counts, or a detailed splitting methodology) within the text.
Hardware Specification | Yes | "All experiments are conducted on eight A100 GPUs. The inference time is measured on a machine with Intel Xeon CPU E5-2678."
Software Dependencies | No | The paper implicitly mentions Python and standard deep-learning libraries (e.g., PyTorch) and refers to tools such as Anserini, but does not provide version numbers for these software dependencies.
Experiment Setup | Yes | The learning rate of the backbone network is set to 2×10⁻⁵ following Formal et al. (2021b), and the learning rate of the HYPER modules is set to 1×10⁻³, selected from {10⁻¹, 10⁻², 10⁻³}. The temperature τ is set to e, λ_q is set to 0.3, λ_d is set to 0.1, d_q is set to 400, and d_p is set to 100. Training runs for up to 10 epochs; both the max document length and max query length are set to 512 to accommodate tasks with very long queries, and the batch size is set to 256. For each query, 1 positive sample and 19 negative samples are provided for training. The sequence length of each basic prompt is set to 100, selected from {10, 50, 100}. The weight λ_c and the number of shared basic prompts N are tuned, with 0.1 and 20 finally selected, respectively.
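
For quick reference, the reported settings can be collected into a small configuration sketch. This is a minimal illustration assembled from the quoted numbers above; the name HYPER_TRAINING_CONFIG and all field names are hypothetical and do not correspond to identifiers in the authors' repository.

import math

# Hypothetical configuration dictionary summarizing the hyperparameters
# reported in the Experiment Setup row; field names are illustrative only.
HYPER_TRAINING_CONFIG = {
    "lr_backbone": 2e-5,             # following Formal et al. (2021b)
    "lr_hyper_modules": 1e-3,        # selected from {1e-1, 1e-2, 1e-3}
    "temperature_tau": math.e,       # tau = e
    "lambda_q": 0.3,
    "lambda_d": 0.1,
    "d_q": 400,
    "d_p": 100,
    "max_epochs": 10,
    "max_document_length": 512,      # long enough for tasks with very long queries
    "max_query_length": 512,
    "batch_size": 256,
    "positives_per_query": 1,
    "negatives_per_query": 19,
    "basic_prompt_length": 100,      # selected from {10, 50, 100}
    "lambda_c": 0.1,                 # tuned
    "num_shared_basic_prompts": 20,  # tuned (N)
}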