Graph-based Nearest Neighbor Search: From Practice to Theory

Authors: Liudmila Prokhorenkova, Aleksandr Shekhovtsov

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We believe that our theoretical insights supported by experimental analysis are an important step towards understanding the limits and benefits of graph-based NNS algorithms. Finally, in Section 5, we empirically illustrate the obtained results, discuss the most promising techniques, and demonstrate why our assumptions are reasonable.
Researcher Affiliation Collaboration 1Yandex, Moscow, Russia 2Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia 3Higher School of Economics, Moscow, Russia.
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks with explicit labels like "Pseudocode" or "Algorithm".
Open Source Code Yes The source code of these experiments is publicly available.7 https://github.com/Shekhale/gbnns_theory. The source code for reproducing these experiments is publicly available.8 https://github.com/Shekhale/gbnns_dim_ red
Open Datasets Yes We also experimented with the widely used real-world datasets: SIFT and GIST (Jegou et al., 2010), Glo Ve (Pennington et al., 2014), and DEEP1B (Babenko & Lempitsky, 2016).
Dataset Splits No The paper lists the number of base and query samples for real datasets, which implicitly serves as a train/test split for NNS problems. However, it does not explicitly state training, validation, and testing splits using percentages, exact counts for each split, or references to predefined standard splits with citations, as requested.
Hardware Specification Yes All algorithms were run on a single core of Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz.
Software Dependencies No The paper mentions that source code is publicly available via GitHub links, implying the use of programming languages and libraries. However, it does not provide specific version numbers for any software components or dependencies within the main text or its footnotes.
Experiment Setup No The paper describes varying parameters like the number of local neighbors and candidates for beam search to generate Time-vs-Quality curves. However, it does not provide specific numerical values for hyperparameters or other system-level training settings in the main text.