ConTextual Masked Auto-Encoder for Dense Passage Retrieval

Authors: Xing Wu, Guangyuan Ma, Meng Lin, Zijia Lin, Zhongyuan Wang, Songlin Hu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines, demonstrating the high efficiency of CoT-MAE.
Researcher Affiliation | Collaboration | Xing Wu^{1,2,3}*, Guangyuan Ma^{1,2}*, Meng Lin^{1,2}, Zijia Lin^3, Zhongyuan Wang^3, Songlin Hu^{1,2}. 1: Institute of Information Engineering, Chinese Academy of Sciences; 2: School of Cyber Security, University of Chinese Academy of Sciences; 3: Kuaishou Technology.
Pseudocode | No | The paper describes the methodology using text and mathematical formulas but does not provide pseudocode or an algorithm block.
Open Source Code | Yes | Our code is available at https://github.com/caskcsg/ir/tree/main/cotmae.
Open Datasets | Yes | We fine-tune the pre-trained CoT-MAE on MS-MARCO passage ranking (Nguyen et al. 2016), Natural Question (Kwiatkowski et al. 2019), and TREC Deep Learning (DL) Track 2020 (Craswell et al. 2020) tasks for evaluation. Following coCondenser (Gao and Callan 2021b), we use the MS-MARCO corpus released in (Qu et al. 2020); following RocketQA (Qu et al. 2020), we use the NQ version created by DPR (Karpukhin et al. 2020).
Dataset Splits | No | The paper mentions using a widely adopted evaluation pipeline (Tevatron) and details pre-training steps and batch sizes, but does not explicitly describe train/validation/test splits, their percentages, or sample counts. It refers to using specific versions of the datasets but not how they were partitioned for reproducibility.
Hardware Specification | Yes | We train for 4 days with a global batch size of 1024 on 8 Tesla A100 GPUs.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies, beyond implicitly referring to general tools such as NLTK and frameworks such as BERT/PyTorch.
Experiment Setup | Yes | We pre-train for up to 1200k steps using the AdamW optimizer, with a learning rate of 1e-4 and a linear schedule with warmup ratio 0.1. We train for 4 days with a global batch size of 1024 on 8 Tesla A100 GPUs. For fine-tuning, the similarity of a query-passage pair $\langle q, p \rangle$ is defined as an inner product: $s(q, p) = f_q(q) \cdot f_p(p)$. Query and passage encoders are fine-tuned on the retrieval task's training corpus with a contrastive loss: $\mathcal{L} = -\log \frac{\exp(s(q, p^+))}{\exp(s(q, p^+)) + \sum_{l} \exp(s(q, p^-_l))}$.
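The quoted fine-tuning objective is a standard InfoNCE-style contrastive loss over one positive and a set of negative passages. Below is a minimal PyTorch sketch of that loss under the inner-product similarity described above; it assumes the query and passage encoders have already produced dense embeddings, and the function and tensor names (contrastive_loss, q_emb, pos_emb, neg_emb) are illustrative, not taken from the CoT-MAE codebase.

```python
# Minimal sketch of the contrastive fine-tuning loss quoted above.
# Assumes dense embeddings f_q(q) and f_p(p) are already computed by
# separate query/passage encoders; names are illustrative only.
import torch
import torch.nn.functional as F


def contrastive_loss(q_emb: torch.Tensor,
                     pos_emb: torch.Tensor,
                     neg_emb: torch.Tensor) -> torch.Tensor:
    """InfoNCE-style loss with one positive and L negatives per query.

    q_emb:   (B, D) query embeddings f_q(q)
    pos_emb: (B, D) positive passage embeddings f_p(p+)
    neg_emb: (B, L, D) negative passage embeddings f_p(p_l^-)
    """
    # Inner-product similarity s(q, p) = f_q(q) . f_p(p)
    pos_scores = (q_emb * pos_emb).sum(dim=-1, keepdim=True)   # (B, 1)
    neg_scores = torch.einsum("bd,bld->bl", q_emb, neg_emb)    # (B, L)
    scores = torch.cat([pos_scores, neg_scores], dim=-1)       # (B, 1+L)

    # The positive sits at index 0, so cross-entropy with target 0 equals
    # -log( exp(s(q,p+)) / (exp(s(q,p+)) + sum_l exp(s(q,p_l^-))) ).
    targets = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    return F.cross_entropy(scores, targets)
```

In practice, dense-retrieval fine-tuning setups of this kind often also score against in-batch negatives (the positives of other queries in the batch); the sketch above only covers the explicit negatives that appear in the quoted formula.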