Sublinear Time Nearest Neighbor Search over Generalized Weighted Space

Authors: Yifan Lei, Qiang Huang, Mohan Kankanhalli, Anthony Tung

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluations over three real datasets demonstrate the superior performance of the two proposed schemes. In this section, we study the performance of SL-ALSH and S2-ALSH for NNS over d_w on three real-life datasets, i.e., Mnist, Sift, and Movie Lens Full (or simply Movie Lens).
Researcher Affiliation | Academia | School of Computing, National University of Singapore, Singapore. Correspondence to: Qiang Huang <huangq@comp.nus.edu.sg>, Anthony K. H. Tung <atung@comp.nus.edu.sg>.
Pseudocode | No | The paper describes the proposed methods and their mathematical formulations (e.g., in Section 4), but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about the release of source code for the described methodology, nor does it provide any links to code repositories.
Open Datasets | Yes | In this section, we study the performance of SL-ALSH and S2-ALSH for NNS over d_w on three real-life datasets, i.e., Mnist [1], Sift [2], and Movie Lens Full [3] (or simply Movie Lens). For Mnist and Sift, we randomly sample 1,000 objects from their test sets as queries. ... [1] http://yann.lecun.com/exdb/mnist/ [2] http://corpus-texmex.irisa.fr/ [3] https://grouplens.org/datasets/movielens/
Dataset Splits | No | The paper mentions using ‘test sets as queries’ for Mnist and Sift, and for Movie Lens, states ‘we randomly sample 1,000 vectors from item vectors as queries and use the rest item vectors as dataset.’ It does not explicitly define or provide details for training, validation, or specific test splits (e.g., percentages or counts) needed to reproduce data partitioning for the entire experimental process.
Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU/CPU models, memory specifications) used to run the experiments.
Software Dependencies | No | The paper does not provide any specific software dependency details, such as library names with version numbers, that would be needed to replicate the experiment environment.
Experiment Setup | Yes | Based on the above results, we use the settings of U = π and K = 256 for both schemes in the subsequent experiments. We set the bucket width r to be 56, 23, and 3 for Mnist, Sift, and Movie Lens, respectively, to achieve their best results.
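
For readers attempting a reproduction, the reported settings and the Movie Lens query split can be written down roughly as follows. This is a minimal sketch, not the authors' code: the names REPORTED_SETTINGS and split_queries, the dictionary layout, and the seed are all assumptions made for illustration (the paper reports no seed), and the vectors in the demo are placeholders rather than real item vectors.

```python
import math
import random

# Values quoted in the Experiment Setup row; the dictionary layout is hypothetical.
REPORTED_SETTINGS = {
    "U": math.pi,  # U = π for both schemes
    "K": 256,      # K = 256 for both schemes
    "bucket_width_r": {"Mnist": 56, "Sift": 23, "Movie Lens": 3},
}


def split_queries(vectors, n_queries=1000, seed=0):
    """Randomly hold out n_queries vectors as queries; the rest form the dataset.

    Mirrors the Movie Lens description ("randomly sample 1,000 vectors from
    item vectors as queries and use the rest item vectors as dataset").
    The seed is an assumption: the paper does not report one.
    """
    rng = random.Random(seed)
    indices = list(range(len(vectors)))
    rng.shuffle(indices)  # random permutation of positions
    queries = [vectors[i] for i in indices[:n_queries]]
    dataset = [vectors[i] for i in indices[n_queries:]]
    return queries, dataset


if __name__ == "__main__":
    # Placeholder vectors standing in for real item vectors.
    placeholder = [[float(i), float(i) * 0.5] for i in range(5000)]
    queries, dataset = split_queries(placeholder)
    print(len(queries), len(dataset))
    print(REPORTED_SETTINGS["bucket_width_r"]["Movie Lens"])
```

Because the seed is not given in the paper, a split produced this way will not match the authors' exact query set; it only follows the same sampling procedure.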