MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding

Authors: Rajesh Jayaram, Laxman Dhulipala, Majid Hadian, Jason D. Lee, Vahab Mirrokni

NeurIPS 2024

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. Empirically, we find that FDEs achieve the same recall as prior state-of-the-art heuristics while retrieving 2-5× fewer candidates. Compared to prior state-of-the-art implementations, MUVERA achieves consistently good end-to-end recall and latency across a diverse set of the BEIR retrieval datasets, achieving an average of 10% improved recall with 90% lower latency.
Researcher Affiliation: Collaboration. Laxman Dhulipala (Google Research and UMD), Majid Hadian (Google DeepMind), Rajesh Jayaram (Google Research), Jason Lee (Google Research), Vahab Mirrokni (Google Research).
Pseudocode: Yes. Figure 2: FDE Generation Process. Three SimHashes (k_sim = 3) split space into six regions labelled A-F (in high dimensions B = 2^k_sim, but B = 6 here since d = 2). F_q(Q) and F_doc(P) are shown as B × d matrices, where the k-th row is q_(k), p_(k). The actual FDEs are flattened versions of these matrices. Not shown: inner projections, repetitions, and fill_empty_clusters.
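To make the figure's description concrete, below is a minimal single-repetition sketch of FDE generation in NumPy. It follows the caption: SimHash sign bits partition token embeddings into B = 2^k_sim clusters, queries sum their token embeddings within each cluster while documents average theirs, and the resulting B × d matrix is flattened. Like the figure, it omits inner projections, repetitions, and fill_empty_clusters; the function names are illustrative, not the authors' code.

import numpy as np

def simhash_buckets(vectors, gaussians):
    # Map each token embedding to one of B = 2^k_sim buckets via SimHash sign bits.
    bits = (vectors @ gaussians.T > 0).astype(np.int64)      # (n_tokens, k_sim)
    return bits @ (2 ** np.arange(gaussians.shape[0]))       # bucket id per token

def fde_single_rep(token_embeddings, gaussians, is_query):
    # One repetition of a fixed dimensional encoding: a B x d matrix, flattened.
    k_sim = gaussians.shape[0]
    B, d = 2 ** k_sim, token_embeddings.shape[1]
    buckets = simhash_buckets(token_embeddings, gaussians)
    fde = np.zeros((B, d))
    for b in range(B):
        members = token_embeddings[buckets == b]
        if len(members) == 0:
            continue                                          # fill_empty_clusters omitted
        # Queries sum their tokens in each cluster; documents average them.
        fde[b] = members.sum(axis=0) if is_query else members.mean(axis=0)
    return fde.reshape(-1)

# Example: shared SimHash directions, one query and one document of random token embeddings.
rng = np.random.default_rng(0)
g = rng.standard_normal((3, 16))                              # k_sim = 3, d = 16
q_fde = fde_single_rep(rng.standard_normal((32, 16)), g, is_query=True)
p_fde = fde_single_rep(rng.standard_normal((80, 16)), g, is_query=False)
score = q_fde @ p_fde  # one inner product approximates the multi-vector (Chamfer) similarity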
Open Source Code: No. Our end-to-end retrieval engine is implemented in C++ in a proprietary codebase, preventing us from directly releasing it. As described in Section 3.2, we plan to publish a standalone open-source implementation of the FDE generation step upon publication, along with the product quantization code (which is a textbook method) and the ball-carving code.
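The product quantization component is described above as a textbook method; for readers unfamiliar with it, here is a rough NumPy sketch of such a quantizer (one k-means codebook per block of coordinates, then one byte-sized codeword per block). The block size, codebook size, and function names are illustrative assumptions, not the paper's forthcoming release.

import numpy as np

def train_pq(X, block_dims=8, codebook_size=256, iters=10, seed=0):
    # Textbook product quantization; assumes at least codebook_size training vectors.
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(X.shape[1]), X.shape[1] // block_dims)
    codebooks = []
    for cols in blocks:
        sub = X[:, cols]
        centers = sub[rng.choice(len(sub), size=codebook_size, replace=False)].copy()
        for _ in range(iters):
            # Assign each vector's block to its nearest center, then recompute the centers.
            d2 = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            assign = d2.argmin(axis=1)
            for c in range(codebook_size):
                pts = sub[assign == c]
                if len(pts):
                    centers[c] = pts.mean(axis=0)
        codebooks.append((cols, centers))
    return codebooks

def pq_encode(X, codebooks):
    # Encode each vector as one codeword index (uint8) per block.
    codes = []
    for cols, centers in codebooks:
        d2 = ((X[:, cols][:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        codes.append(d2.argmin(axis=1))
    return np.stack(codes, axis=1).astype(np.uint8)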
Open Datasets: Yes. Our evaluation includes results from six of the well-studied BEIR [46] information retrieval datasets: MS MARCO [40] (CC BY-SA 4.0), HotpotQA (CC BY-SA 4.0) [53], NQ (Apache-2.0) [31], Quora (Apache-2.0) [46], SciDocs (CC BY 4.0) [11], and ArguAna (Apache-2.0) [47].
Dataset Splits: Yes. Following [43], we use the development set for our experiments on MS MARCO, and use the test set on the other datasets.
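Assuming the public beir Python package, the dataset loading and the split convention quoted above can be reproduced roughly as follows; the dataset name and output directory are placeholders.

from beir import util
from beir.datasets.data_loader import GenericDataLoader

dataset = "scidocs"  # e.g. "msmarco", "hotpotqa", "nq", "quora", "scidocs", "arguana"
url = f"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip"
data_path = util.download_and_unzip(url, "datasets")

# Development set for MS MARCO, test set for the other datasets (as in the paper).
split = "dev" if dataset == "msmarco" else "test"
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=split)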
Hardware Specification: Yes. We run our online experiments on an Intel Sapphire Rapids machine on Google Cloud (c3-standard-176). The machine supports up to 176 hyper-threads.
Software Dependencies: No. Insufficient information: the paper mentions 'implemented in C++' and uses 'DiskANN [25]', but does not provide specific version numbers for any software components, libraries, or solvers.
Experiment Setup: Yes. We perform a grid search over FDE parameters R_reps ∈ {1, 5, 10, 15, 20}, k_sim ∈ {2, 3, 4, 5, 6}, d_proj ∈ {8, 16, 32, 64}... Our single-vector retrieval engine uses a scalable implementation [38] of DiskANN [25]... We build DiskANN indices by using the uncompressed document FDEs with a maximum degree of 200 and a build beam-width of 600... Based on these empirical results, we choose the value of τ = 0.7 in our end-to-end experiments.
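The quoted grid can be enumerated directly: each repetition contributes B = 2^k_sim clusters of d_proj coordinates, so the final FDE dimension is d_FDE = R_reps * 2^k_sim * d_proj. The sketch below lists every combination alongside the DiskANN build settings and ball-carving threshold quoted above; it is only an illustration of the search space, not the authors' tuning harness.

from itertools import product

R_REPS = [1, 5, 10, 15, 20]
K_SIM = [2, 3, 4, 5, 6]
D_PROJ = [8, 16, 32, 64]

MAX_DEGREE = 200        # DiskANN maximum degree (from the quote above)
BUILD_BEAM_WIDTH = 600  # DiskANN build beam-width
TAU = 0.7               # ball-carving threshold used in the end-to-end experiments

for r_reps, k_sim, d_proj in product(R_REPS, K_SIM, D_PROJ):
    d_fde = r_reps * (2 ** k_sim) * d_proj  # B = 2^k_sim clusters per repetition
    print(f"R_reps={r_reps:2d}  k_sim={k_sim}  d_proj={d_proj:2d}  ->  d_FDE={d_fde}")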