Model-enhanced Vector Index
Authors: Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, Bin Cui
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that our model achieves better performance on the commonly used academic benchmarks MSMARCO Passage and Natural Questions, with comparable serving latency to dense retrieval solutions. |
| Researcher Affiliation | Collaboration | 1 School of Computer Science & Key Lab of High Confidence Software Technologies, Peking University; 2 Microsoft; 3 Aerospace Information Research Institute & Key Laboratory of Target Cognition and Application Technology, Chinese Academy of Sciences; 4 Institute of Information Engineering, Chinese Academy of Sciences; 5 Carnegie Mellon University; 6 National University of Singapore; 7 National Engineering Laboratory for Big Data Analysis and Applications, Peking University |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found. |
| Open Source Code | Yes | The code of MEVI is available at: https://github.com/HugoZHL/MEVI. |
| Open Datasets | Yes | We conduct experiments on two widely used large-scale document retrieval benchmarks, i.e. MSMARCO [34] Passage Retrieval dataset and Natural Questions (NQ) [22] dataset. |
| Dataset Splits | Yes | For both datasets, we use their predetermined training and validation splits for evaluation. |
| Hardware Specification | Yes | All experiments of NCI are conducted on an NVIDIA V100 GPU cluster with 32GB memory per GPU, and each job runs on 8 GPUs. |
| Software Dependencies | No | The paper mentions software like scikit-learn, Faiss, and ONNX Runtime, but does not specify their version numbers, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We use the same optimizer and learning rate as NCI; however, we use a larger batch size, 256 per GPU, and disable Rdrop to reduce the overall training cost. For inference, we use beam search with a beam size of 10 to 1000. |
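The quoted setup decodes candidate identifiers with beam search at beam sizes between 10 and 1000, but the paper provides no pseudocode (see the Pseudocode row). The sketch below is therefore a generic beam-search decoder over per-step score tables, not MEVI's implementation; the names `beam_search` and `step_scores` are illustrative assumptions.

```python
import math

def beam_search(step_scores, beam_size):
    """Toy beam search over a fixed-depth token sequence.

    step_scores: one dict per decoding step, mapping a token (a stand-in
    for a cluster identifier) to its log-probability.
    Returns the beam_size highest-scoring (sequence, log-prob) pairs.
    """
    beams = [((), 0.0)]  # (partial sequence, cumulative log-prob)
    for scores in step_scores:
        # Expand every surviving beam with every token at this step.
        candidates = [
            (seq + (tok,), lp + tok_lp)
            for seq, lp in beams
            for tok, tok_lp in scores.items()
        ]
        # Prune back to the top-`beam_size` partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"x": math.log(0.9), "y": math.log(0.1)},
]
top = beam_search(steps, 2)  # top[0][0] == ("a", "x")
```

A larger `beam_size` widens the candidate set at each step, which is why the reported range (10 to 1000) trades recall against inference latency.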