Accelerating Large-Scale Inference with Anisotropic Vector Quantization
Authors: Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments: In this section, we show our proposed quantization objective leads to improved performance on maximum inner product search. First, we fix the quantization mechanism and compare traditional reconstruction loss with our proposed loss to show that score-aware loss leads to better retrieval performance and more accurate estimation of maximum inner product values. Next, we compare in fixed-bit-rate settings against QUIPS and LSQ, which are the current state-of-the-art for many MIPS tasks. Finally, we analyze the end-to-end MIPS retrieval performance of our algorithm in terms of its speed-recall trade-off in a standardized hardware environment. We used the benchmark setup from ann-benchmarks.com, which provides 11 competitive baselines with pre-tuned parameters. We plot each algorithm's speed-recall curve and show ours achieves the state of the art. (A sketch of the score-aware loss appears below the table.) |
| Researcher Affiliation | Industry | Google Research. Correspondence to: Philip Sun <sunphil@google.com>. |
| Pseudocode | No | The paper describes iterative algorithms with numbered steps, but these are presented as descriptive text rather than formal pseudocode blocks or figures labeled 'Algorithm'. |
| Open Source Code | Yes | The proposed approach, whose implementation is open-source; Our implementation is open-source and available at https://github.com/google-research/google-research/tree/master/scann (A usage sketch appears below the table.) |
| Open Datasets | Yes | We use Glove1.2M which is a collection of 1.2 million 100-dimensional word embeddings trained as described in (Pennington et al., 2014). |
| Dataset Splits | No | The paper mentions using Glove1.2M and the ANN-Benchmarks setup, which implies standard datasets and procedures, but it does not explicitly state the specific training, validation, and test dataset splits (e.g., percentages or sample counts) used in their experiments. |
| Hardware Specification | Yes | Our benchmarks are all conducted on an Intel Xeon W-2135 with a single CPU thread |
| Software Dependencies | No | The paper mentions building its implementation on specific techniques such as SIMD-based ADC and combining it with a vector-quantization-based tree, and compares against faiss and hnswlib. However, it does not specify version numbers for any of the software dependencies used in its own implementation. |
| Experiment Setup | Yes | For all subsequent experiments, we set T = 0.2, which by the limit in Equation (3) corresponds to a value of η = 4.125. |
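
The excerpt in the Research Type row centers on the paper's score-aware (anisotropic) quantization loss, which penalizes the component of the quantization residual parallel to the datapoint more heavily than the orthogonal component. The following is a minimal NumPy sketch of that decomposition, written here for illustration rather than taken from the authors' code:

```python
import numpy as np

def score_aware_loss(x, x_quantized, eta):
    """Illustrative sketch of the score-aware (anisotropic) quantization loss.

    The residual r = x - x_quantized is split into the component parallel to
    the datapoint x and the component orthogonal to it; the parallel part is
    weighted by eta >= 1. This is a toy restatement, not the authors'
    implementation.
    """
    r = x - x_quantized
    r_parallel = (np.dot(r, x) / np.dot(x, x)) * x   # residual along x
    r_orthogonal = r - r_parallel
    return eta * np.dot(r_parallel, r_parallel) + np.dot(r_orthogonal, r_orthogonal)

# Toy example: a unit vector and a slightly perturbed "quantized" version.
x = np.array([1.0, 0.0, 0.0])
x_q = np.array([0.9, 0.1, 0.0])
print(score_aware_loss(x, x_q, eta=4.125))  # 4.125 * 0.01 + 0.01 = 0.05125
```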
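
The repository linked in the Open Source Code row ships the ScaNN Python package. The snippet below follows the builder pattern shown in the ScaNN README; the toy dataset and the tree/reordering parameters are illustrative choices rather than settings prescribed by the paper, while `anisotropic_quantization_threshold=0.2` mirrors the T = 0.2 value quoted in the Experiment Setup row:

```python
import numpy as np
import scann  # pip install scann

# Toy database of unit-normalized vectors so that MIPS matches the paper's setting.
dataset = np.random.randn(100_000, 100).astype(np.float32)
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)

# Builder pattern as in the ScaNN README; parameter values here are illustrative.
searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=1000, num_leaves_to_search=100, training_sample_size=50_000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)  # T = 0.2 as quoted above
    .reorder(100)
    .build()
)

queries = np.random.randn(5, 100).astype(np.float32)
neighbors, distances = searcher.search_batched(queries)
print(neighbors.shape, distances.shape)  # (5, 10) each
```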
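
The Open Datasets row cites Glove1.2M (100-dimensional GloVe embeddings). A common way to obtain an equivalent dataset is the ann-benchmarks HDF5 file; the URL and field names below are assumptions based on ann-benchmarks conventions and should be verified against the current site. Unit-normalizing the vectors makes the angular benchmark equivalent to maximum inner product search, matching the paper's setting:

```python
import urllib.request

import h5py
import numpy as np

# ann-benchmarks distributes GloVe-100 as an HDF5 file (verify URL if needed).
url = "http://ann-benchmarks.com/glove-100-angular.hdf5"
path = "glove-100-angular.hdf5"
urllib.request.urlretrieve(url, path)

with h5py.File(path, "r") as f:
    train = np.array(f["train"])   # database vectors, 100-d
    test = np.array(f["test"])     # query vectors

# Unit-normalize so that cosine similarity equals the inner product.
train /= np.linalg.norm(train, axis=1, keepdims=True)
test /= np.linalg.norm(test, axis=1, keepdims=True)
```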
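
The Experiment Setup row quotes T = 0.2 and η = 4.125. Assuming the limit in Equation (3) takes the form η = (d − 1)T²/(1 − T²) (the equation itself is not reproduced in the excerpt), the quoted η follows directly for the 100-dimensional GloVe vectors:

```python
# Assumed form of the limit in Equation (3): eta = (d - 1) * T**2 / (1 - T**2).
# With 100-dimensional GloVe vectors and the quoted T = 0.2 this reproduces
# the eta = 4.125 reported in the Experiment Setup row.
d, T = 100, 0.2
eta = (d - 1) * T**2 / (1 - T**2)
print(eta)  # 4.125
```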