Multiscale Quantization for Fast Similarity Search
Authors: Xiang Wu, Ruiqi Guo, Ananda Theertha Suresh, Sanjiv Kumar, Daniel N. Holtmann-Rice, David Simcha, Felix Yu
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods. (Abstract) Additionally, Section 4 ("Experiments") and its subsections are dedicated to empirical evaluation. |
| Researcher Affiliation | Industry | Google Research, New York {wuxiang, guorq, theertha, sanjivk, dhr, dsimcha, felixyu}@google.com |
| Pseudocode | No | The paper describes the methodology and optimization procedure in text and mathematical equations, but it does not contain clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate the performance of end-to-end trained multiscale quantization (MSQ) on the SIFT1M [20] and DEEP10M [3] datasets, which are often used in benchmarking the performance of nearest neighbor search. (Section 4.1 Evaluation Datasets) A loader sketch for the standard SIFT1M file format appears after this table. |
| Dataset Splits | No | The paper mentions "At training time" and "At query time" and uses the SIFT1M and DEEP10M datasets for evaluation, which implies train/test usage. However, it does not provide explicit train/validation/test split percentages or sample counts, nor does it cite predefined splits, which limits reproducibility. |
| Hardware Specification | No | The paper mentions "efficient GPU implementation for ADC lookup" and "VPSHUFB instruction from AVX2", but it does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions using "Adam optimization algorithm [23]" and "Python", along with the "VPSHUFB instruction from AVX2". However, it does not provide version numbers for any software libraries, programming languages, or solvers required to replicate the experiments. |
| Experiment Setup | Yes | All optimization parameters were fixed for all datasets... We used the Adam optimization algorithm [23] with the parameters suggested by the authors, minibatch sizes of 2000, and a learning rate of 1e-4 during joint training (and 1e-3 when training only the vector quantizers). (Section 2.2 Optimization Procedure) These settings are sketched in code below the table. |
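For reference, the SIFT1M release (from Jégou et al.'s TEXMEX corpus) is distributed as `.fvecs` binary files. Below is a minimal NumPy loader sketch assuming that standard layout; the function name and file path are illustrative, not taken from the paper.

```python
import numpy as np

def read_fvecs(path):
    # TEXMEX .fvecs layout: each vector is stored as an int32 dimension d
    # followed by d float32 components, repeated for every vector.
    raw = np.fromfile(path, dtype=np.int32)
    d = raw[0]
    # int32 and float32 are both 4 bytes, so a same-size dtype view works
    # after dropping the leading dimension column.
    return raw.reshape(-1, d + 1)[:, 1:].copy().view(np.float32)

# Illustrative usage: SIFT1M base vectors are 128-dimensional.
# base = read_fvecs("sift/sift_base.fvecs")  # expected shape: (1000000, 128)
```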
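The quoted hyperparameters map directly onto an optimizer configuration. The sketch below uses PyTorch as a hypothetical framework choice (the paper does not name its training framework), and the stand-in parameter tensor is illustrative rather than the paper's actual quantizer model.

```python
import torch

# Stand-in for the learnable quantization parameters (codebooks, rotation,
# scales); the actual multiscale quantization model is not reproduced here.
params = [torch.nn.Parameter(torch.randn(256, 128))]

# Settings quoted from the paper: Adam with the authors' suggested defaults,
# minibatches of 2000, lr 1e-4 for joint training and 1e-3 when training
# only the vector quantizers.
BATCH_SIZE = 2000
joint_opt = torch.optim.Adam(params, lr=1e-4)
vq_only_opt = torch.optim.Adam(params, lr=1e-3)
```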