HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization

Authors: ZeFeng Cai, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Xin Alex Lin, Liang He, Daxin Jiang

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show our model attains better retrieval performance across different tasks and better zero-shot transfer ability compared with various previous methods." From Section 3 (Experiments), Benchmark Datasets: "We use two publicly available retrieval datasets for evaluation, including KILT (Petroni et al., 2021) and BEIR (Thakur et al., 2021)."
Researcher Affiliation | Collaboration | Department of Computer Science, East China Normal University; Microsoft Corporation, Beijing, China
Pseudocode | No | The paper includes mathematical equations and descriptions of processes, but no structured pseudocode or algorithm blocks are explicitly present or labeled.
Open Source Code | Yes | "Our Code is available at https://github.com/oklen/Hyper."
Open Datasets | Yes | "We use two publicly available retrieval datasets for evaluation, including KILT (Petroni et al., 2021) and BEIR (Thakur et al., 2021)."
Dataset Splits | No | The paper mentions using the KILT and BEIR datasets but does not explicitly provide training/validation/test splits (percentages, sample counts, or a detailed splitting methodology) within the text.
Hardware Specification | Yes | "All experiments are conducted on eight A100 GPUs. The inference time is measured on a machine with Intel Xeon CPU E5-2678."
Software Dependencies | No | The paper implicitly mentions Python and standard deep-learning libraries (e.g., PyTorch) and refers to tools such as Anserini, but does not provide version numbers for these software dependencies.
Experiment Setup | Yes | The learning rate of the backbone network is set to 2×10⁻⁵ following Formal et al. (2021b), and the learning rate of the HYPER modules is set to 1×10⁻³, selected from {10⁻¹, 10⁻², 10⁻³}. The temperature τ is set to e, λ_q is set to 0.3, λ_d is set to 0.1, d_q is set to 400, and d_p is set to 100. Training runs for up to 10 epochs; both the max document length and max query length are set to 512 to accommodate tasks with very long queries, and the batch size is set to 256. For each query, 1 positive sample and 19 negative samples are provided for training. The sequence length of each basic prompt is set to 100, selected from {10, 50, 100}. The weight λ_c and the number of shared basic prompts N are tuned, with 0.1 and 20 finally selected, respectively.
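
For quick reference, the reported settings can be collected into a small configuration sketch. This is a minimal illustration assembled from the quoted numbers above; the name HYPER_TRAINING_CONFIG and all field names are hypothetical and do not correspond to identifiers in the authors' repository.

import math

# Hypothetical configuration dictionary summarizing the hyperparameters
# reported in the Experiment Setup row; field names are illustrative only.
HYPER_TRAINING_CONFIG = {
    "lr_backbone": 2e-5,             # following Formal et al. (2021b)
    "lr_hyper_modules": 1e-3,        # selected from {1e-1, 1e-2, 1e-3}
    "temperature_tau": math.e,       # tau = e
    "lambda_q": 0.3,
    "lambda_d": 0.1,
    "d_q": 400,
    "d_p": 100,
    "max_epochs": 10,
    "max_document_length": 512,      # long enough for tasks with very long queries
    "max_query_length": 512,
    "batch_size": 256,
    "positives_per_query": 1,
    "negatives_per_query": 19,
    "basic_prompt_length": 100,      # selected from {10, 50, 100}
    "lambda_c": 0.1,                 # tuned
    "num_shared_basic_prompts": 20,  # tuned (N)
}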