Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing
Authors: Ke Li, Jitendra Malik
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show appealing theoretical properties and demonstrate empirically that the proposed algorithm outperforms locality-sensitive hashing (LSH) in terms of approximation quality, speed and space efficiency. ... We compare the performance of the proposed algorithm to that of LSH... We compare the performance of the proposed algorithm and LSH on the CIFAR-100 and MNIST datasets... |
| Researcher Affiliation | Academia | Ke Li KE.LI@EECS.BERKELEY.EDU Jitendra Malik MALIK@EECS.BERKELEY.EDU University of California, Berkeley, CA 94720, United States |
| Pseudocode | Yes | Algorithm 1 Algorithm for data structure construction; Algorithm 2 Algorithm for k-nearest neighbour retrieval |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We compare the performance of the proposed algorithm and LSH on the CIFAR-100 and MNIST datasets, which consist of 32×32 colour images of various real-world objects and 28×28 grayscale images of handwritten digits respectively. |
| Dataset Splits | No | The paper describes combining the training and test sets and then selecting query points using cross-validation. It does not mention a separate validation split for model tuning. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'self-balancing binary search trees or skip lists' and comparing against 'Exact Euclidean LSH (E2LSH)', but it does not specify any software names with version numbers for reproducibility (e.g., Python, PyTorch, or specific library versions). |
| Experiment Setup | Yes | We adopt the recommended guidelines for choosing parameters of LSH and used 24 hashes per table and 100 tables. For the proposed method, we used m = 25 and L = 2 on CIFAR-100 and m = 15 and L = 3 on MNIST, which we found to work well in practice. |
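The Pseudocode row above refers to the paper's Algorithm 1 (data structure construction) and Algorithm 2 (k-nearest neighbour retrieval). Since no source code is released, the following is a simplified sketch of the DCI idea under stated assumptions: the function names `build_dci` and `query_dci` are hypothetical, sorted NumPy arrays stand in for the self-balancing binary search trees or skip lists the paper suggests, and the simplified visiting order below does not reproduce the paper's exact prioritized traversal. Each of the L composite indices holds m random unit directions; a point becomes a candidate once all m simple indices within some composite index have visited it, and candidates are then re-ranked by true Euclidean distance.

```python
import numpy as np

def build_dci(X, m, L, rng):
    """Build a simplified DCI-style index (hypothetical sketch).

    For each of the L composite indices, draw m random unit
    directions and store the projections of all points onto them.
    The paper stores these in self-balancing BSTs or skip lists to
    support dynamic updates; plain arrays suffice for illustration.
    """
    n, d = X.shape
    indices = []
    for _ in range(L):
        U = rng.normal(size=(m, d))
        U /= np.linalg.norm(U, axis=1, keepdims=True)  # unit directions
        proj = X @ U.T  # (n, m) projections of all points
        indices.append((U, proj))
    return indices

def query_dci(X, indices, q, k, n_candidates):
    """Retrieve k nearest neighbours of q (simplified visiting order).

    Within each composite index, each of the m simple indices visits
    points in order of |projected point - projected query|.  A point
    becomes a candidate once all m simple indices of some composite
    index have visited it; candidates are re-ranked by true distance.
    """
    n = X.shape[0]
    candidates = set()
    for U, proj in indices:
        m = U.shape[0]
        qp = q @ U.T  # query projections, shape (m,)
        # visiting order per direction: closest projection first
        ranked = np.argsort(np.abs(proj - qp), axis=0)  # (n, m)
        visits = np.zeros(n, dtype=int)
        for i in range(n):
            for j in range(m):
                p = ranked[i, j]
                visits[p] += 1
                if visits[p] == m:  # seen by every simple index
                    candidates.add(p)
            if len(candidates) >= n_candidates:
                break
    cand = np.array(sorted(candidates))
    dists = np.linalg.norm(X[cand] - q, axis=1)  # true distances
    return cand[np.argsort(dists)[:k]]
```

With `n_candidates` set to the dataset size the retrieval degenerates to exact search, which is a convenient sanity check; the algorithm's speed comes from stopping the traversal after far fewer candidates, which the paper's analysis shows still recovers the true neighbours with high probability.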