Product Quantized Translation for Fast Nearest Neighbor Search

Authors: Yoonho Hwang, Mooyeol Baek, Saehoon Kim, Bohyung Han, Hee-Kap Ahn

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Although our framework is composed of simple operations only, it achieves the state-of-the-art performance compared to existing nearest neighbor search techniques, which is illustrated quantitatively using various large-scale benchmark datasets in different sizes and dimensions. |
| Researcher Affiliation | Collaboration | Yoonho Hwang, Mooyeol Baek, Saehoon Kim, Bohyung Han, Hee-Kap Ahn; Dept. of Computer Science and Engineering, POSTECH, Korea; {cypher, mooyeol, kshkawa, bhhan, heekap}@postech.ac.kr. S. Kim is also affiliated with AItrics. |
| Pseudocode | Yes | The pseudocode of our algorithm is presented in Algorithm 1. |
| Open Source Code | No | The paper states 'We use the source codes released by authors for the implementations of the external algorithms', but it gives no explicit statement or link indicating that the source code for the proposed method itself is publicly available. |
| Open Datasets | Yes | We perform the experiments on four independent datasets, which are denoted by MNIST (Lecun et al. 1998), SIFT5M (Jégou, Douze, and Schmid 2011), GIST1M (Jégou, Douze, and Schmid 2011), and MSCOCO (Lin et al. 2014). |
| Dataset Splits | No | The paper states that MS-COCO 'contains 4,096-dimensional vectors, which are feature descriptors for 123,287 images in training and validation sets', but it does not give training/validation/test split percentages, sample counts, or a partitioning methodology for any of the datasets, nor does it say how the validation set was used in the experiments. |
| Hardware Specification | Yes | All tested algorithms are implemented in C++ in Linux (Fedora 21, g++ 4.9.2), specifically using a single core on Intel Core i7-5820K @ 3.30GHz with 64GB main memory. |
| Software Dependencies | Yes | All tested algorithms are implemented in C++ in Linux (Fedora 21, g++ 4.9.2), specifically using a single core on Intel Core i7-5820K @ 3.30GHz with 64GB main memory. |
| Experiment Setup | Yes | By default, the number of clusters and the dimensionality of partitions are set to 64 and 32, respectively. |
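
To make the default experiment setup concrete, the sketch below shows what 64 clusters and 32-dimensional partitions could mean under a generic product-quantization reading: each vector is cut into contiguous 32-dimensional sub-vectors, and each sub-vector is assigned to the nearest of 64 centroids. This is an assumption about how the two parameters are used, not the paper's Algorithm 1. All function names are hypothetical, the codebooks are random placeholders rather than learned centroids, and the 128-dimensional input is chosen only because SIFT descriptors have that length.

```python
import random

NUM_CLUSTERS = 64    # default number of clusters from the setup row
PARTITION_DIM = 32   # default dimensionality of each partition


def split_into_partitions(vector, partition_dim=PARTITION_DIM):
    """Cut a vector into contiguous sub-vectors of partition_dim entries."""
    if len(vector) % partition_dim != 0:
        raise ValueError("vector length must be a multiple of partition_dim")
    return [vector[i:i + partition_dim]
            for i in range(0, len(vector), partition_dim)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(sub_vector, codebook):
    """Index of the centroid in codebook closest to sub_vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(sub_vector, codebook[k]))


def encode(vector, codebooks, partition_dim=PARTITION_DIM):
    """One centroid index per partition (illustrative only)."""
    parts = split_into_partitions(vector, partition_dim)
    return [nearest_centroid(p, cb) for p, cb in zip(parts, codebooks)]


rng = random.Random(0)
dim = 128                             # assumed input dimensionality
num_partitions = dim // PARTITION_DIM

# Placeholder codebooks: one list of NUM_CLUSTERS random centroids per
# partition. Real codebooks would be learned from data (e.g. by k-means).
codebooks = [[[rng.random() for _ in range(PARTITION_DIM)]
              for _ in range(NUM_CLUSTERS)]
             for _ in range(num_partitions)]

query = [rng.random() for _ in range(dim)]
codes = encode(query, codebooks)
print(len(codes), "partition codes, each in [0, %d)" % NUM_CLUSTERS)
```

With these defaults, a 128-dimensional vector yields four partition codes, each an index below 64. Higher-dimensional inputs such as the 4,096-dimensional MS-COCO features would simply produce more partitions under the same assumption.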