Multiscale Quantization for Fast Similarity Search
Authors: Xiang Wu, Ruiqi Guo, Ananda Theertha Suresh, Sanjiv Kumar, Daniel N. Holtmann-Rice, David Simcha, Felix Yu
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods. (Abstract) Additionally, Section 4 ("Experiments") and its subsections are dedicated to empirical evaluation. |
| Researcher Affiliation | Industry | Google Research, New York {wuxiang, guorq, theertha, sanjivk, dhr, dsimcha, felixyu}@google.com |
| Pseudocode | No | The paper describes the methodology and optimization procedure in text and mathematical equations, but it does not contain clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate the performance of end-to-end trained multiscale quantization (MSQ) on the SIFT1M [20] and DEEP10M [3] datasets, which are often used in benchmarking the performance of nearest neighbor search. (Section 4.1 Evaluation Datasets) A loader sketch for the standard SIFT1M file format appears after this table. |
| Dataset Splits | No | The paper mentions "At training time" and "At query time" and uses the SIFT1M and DEEP10M datasets for evaluation, which implies train/test usage. However, it does not provide explicit train/validation/test split percentages or sample counts, nor does it cite predefined splits, which limits reproducibility. |
| Hardware Specification | No | The paper mentions "efficient GPU implementation for ADC lookup" and "VPSHUFB instruction from AVX2", but it does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions using "Adam optimization algorithm [23]" and "Python", along with the "VPSHUFB instruction from AVX2". However, it does not provide version numbers for any software libraries, programming languages, or solvers required to replicate the experiments. |
| Experiment Setup | Yes | All optimization parameters were fixed for all datasets... We used the Adam optimization algorithm [23] with the parameters suggested by the authors, minibatch sizes of 2000, and a learning rate of 1e-4 during joint training (and 1e-3 when training only the vector quantizers). (Section 2.2 Optimization Procedure) These settings are sketched in code below the table. |
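For reference, the SIFT1M release (from Jégou et al.'s TEXMEX corpus) is distributed as `.fvecs` binary files. Below is a minimal NumPy loader sketch assuming that standard layout; the function name and file path are illustrative, not taken from the paper.

```python
import numpy as np

def read_fvecs(path):
    # TEXMEX .fvecs layout: each vector is stored as an int32 dimension d
    # followed by d float32 components, repeated for every vector.
    raw = np.fromfile(path, dtype=np.int32)
    d = raw[0]
    # int32 and float32 are both 4 bytes, so a same-size dtype view works
    # after dropping the leading dimension column.
    return raw.reshape(-1, d + 1)[:, 1:].copy().view(np.float32)

# Illustrative usage: SIFT1M base vectors are 128-dimensional.
# base = read_fvecs("sift/sift_base.fvecs")  # expected shape: (1000000, 128)
```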
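The quoted hyperparameters map directly onto an optimizer configuration. The sketch below uses PyTorch as a hypothetical framework choice (the paper does not name its training framework), and the stand-in parameter tensor is illustrative rather than the paper's actual quantizer model.

```python
import torch

# Stand-in for the learnable quantization parameters (codebooks, rotation,
# scales); the actual multiscale quantization model is not reproduced here.
params = [torch.nn.Parameter(torch.randn(256, 128))]

# Settings quoted from the paper: Adam with the authors' suggested defaults,
# minibatches of 2000, lr 1e-4 for joint training and 1e-3 when training
# only the vector quantizers.
BATCH_SIZE = 2000
joint_opt = torch.optim.Adam(params, lr=1e-4)
vq_only_opt = torch.optim.Adam(params, lr=1e-3)
```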