Accelerating Large-Scale Inference with Anisotropic Vector Quantization

Authors: Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar

ICML 2020

Reproducibility assessment: each entry below gives the variable, the assessed result, and the supporting excerpt (LLM response) from the paper.
Research Type: Experimental
"5. Experiments. In this section, we show our proposed quantization objective leads to improved performance on maximum inner product search. First, we fix the quantization mechanism and compare traditional reconstruction loss with our proposed loss to show that score-aware loss leads to better retrieval performance and more accurate estimation of maximum inner product values. Next, we compare in fixed-bit-rate settings against QUIPS and LSQ, which are the current state-of-the-art for many MIPS tasks. Finally, we analyze the end-to-end MIPS retrieval performance of our algorithm in terms of its speed-recall trade-off in a standardized hardware environment. We used the benchmark setup from ann-benchmarks.com, which provides 11 competitive baselines with pre-tuned parameters. We plot each algorithm's speed-recall curve and show ours achieves the state-of-the-art."
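The score-aware loss referenced above penalizes quantization error along the direction of the datapoint more heavily than orthogonal error, since the parallel component of the residual dominates inner-product error for high-scoring queries. A minimal sketch of that decomposition, assuming the paper's parallel/orthogonal residual split with relative weight η (the function and variable names here are illustrative, not from the ScaNN codebase):

```python
import numpy as np

def anisotropic_loss(x, x_quantized, eta):
    """Score-aware quantization loss: eta * ||r_parallel||^2 + ||r_orthogonal||^2.

    r_parallel is the residual component along x; weighting it by eta > 1
    penalizes errors that most distort inner products with correlated queries.
    """
    r = x - x_quantized                             # quantization residual
    r_parallel = (np.dot(r, x) / np.dot(x, x)) * x  # projection of r onto x
    r_orthogonal = r - r_parallel
    return eta * np.dot(r_parallel, r_parallel) + np.dot(r_orthogonal, r_orthogonal)
```

Setting eta = 1 recovers plain squared reconstruction error, which is the baseline loss the excerpt compares against.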
Researcher Affiliation: Industry
"Google Research. Correspondence to: Philip Sun <sunphil@google.com>."
Pseudocode: No
"The paper describes iterative algorithms with numbered steps, but these are presented as descriptive text rather than formal pseudocode blocks or figures labeled 'Algorithm'."
Open Source Code: Yes
"The proposed approach, whose implementation is open-source ... Our implementation is open-source and available at https://github.com/google-research/google-research/tree/master/scann"
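To illustrate what the released code provides, here is a minimal end-to-end MIPS sketch using the ScaNN Python package from the linked repository, adapted from its README at the time of writing; the tree and reordering parameters are illustrative placeholders, not the paper's tuned settings:

```python
import numpy as np
import scann

# Toy database; in the paper the database is Glove1.2M.
dataset = np.random.rand(10000, 100).astype(np.float32)
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)  # unit-normalize

# Partitioning tree + asymmetric hashing (AH) scoring, with the anisotropic
# quantization threshold T = 0.2 used in the paper's experiments.
searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=200, num_leaves_to_search=20, training_sample_size=10000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)
    .reorder(100)
    .build()
)

queries = np.random.rand(5, 100).astype(np.float32)
neighbors, distances = searcher.search_batched(queries)
print(neighbors.shape)  # (5, 10): top-10 neighbors per query
```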
Open Datasets: Yes
"We use Glove1.2M which is a collection of 1.2 million 100-dimensional word embeddings trained as described in (Pennington et al., 2014)."
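Glove1.2M corresponds to the glove-100-angular dataset distributed by ann-benchmarks. A short loading sketch, assuming ann-benchmarks' usual download URL and HDF5 layout ('train' and 'test' keys); verify both before depending on them:

```python
import h5py
import requests

url = "http://ann-benchmarks.com/glove-100-angular.hdf5"
path = "glove-100-angular.hdf5"

# Download once; the file is a few hundred megabytes.
with open(path, "wb") as f:
    f.write(requests.get(url).content)

with h5py.File(path, "r") as f:
    database = f["train"][:]  # ~1.2M 100-dimensional embeddings
    queries = f["test"][:]    # held-out query vectors
print(database.shape, queries.shape)
```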
Dataset Splits: No
"The paper mentions using Glove1.2M and the ANN-Benchmarks setup, which implies standard datasets and procedures, but it does not explicitly state the specific training, validation, and test dataset splits (e.g., percentages or sample counts) used in their experiments."
Hardware Specification: Yes
"Our benchmarks are all conducted on an Intel Xeon W-2135 with a single CPU thread."
Software Dependencies: No
"The paper mentions building its implementation on specific techniques like SIMD-based ADC and combining it with a vector-quantization-based tree, and compares against faiss and hnswlib. However, it does not specify version numbers for any of the software dependencies used in their own implementation."
Experiment Setup: Yes
"For all subsequent experiments, we set T = 0.2, which by the limit in Equation (3) corresponds to a value of η = 4.125."
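As a consistency check on the quoted numbers: assuming Equation (3)'s limit form eta(T) = (d - 1) * T^2 / (1 - T^2), the Glove dimensionality d = 100 together with T = 0.2 reproduces the reported η exactly:

```python
# Relation between the threshold T and the loss weight eta, assuming the
# limit form of the paper's Equation (3) for d-dimensional data.
d, T = 100, 0.2
eta = (d - 1) * T**2 / (1 - T**2)
print(eta)  # 4.125, matching the value quoted in the paper
```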