Learning Space Partitions for Nearest Neighbor Search

Authors: Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On several standard benchmarks for NNS (Aumüller et al., 2017), our experiments show that the partitions obtained by Neural LSH consistently outperform partitions found by quantization-based and tree-based methods as well as classic, data-oblivious LSH.
Researcher Affiliation | Collaboration | Yihe Dong (Microsoft), Piotr Indyk (MIT), Ilya Razenshteyn (Microsoft Research), Tal Wagner (MIT)
Pseudocode | Yes | Algorithm 1: Nearest neighbor search with a learned space partition (a hedged sketch of this query procedure is given after the table).
Open Source Code | No | The paper mentions using the open-source solver KaHIP but does not provide a link or statement about the availability of their own source code for the proposed method.
Open Datasets | Yes | For the experimental evaluation, we use three standard ANN benchmarks (Aumüller et al., 2017): SIFT (image descriptors, 1M 128-dimensional points), GloVe (word embeddings (Pennington et al., 2014), approximately 1.2M 100-dimensional points, normalized), and MNIST (images of digits, 60K 784-dimensional points).
Dataset Splits | No | The paper mentions using the dataset P as a 'training set' and 'query points' for evaluation, but it does not explicitly describe a separate validation split with specific percentages or counts.
Hardware Specification | No | The paper refers to general hardware like 'GPUs' but does not provide specific models or configurations (e.g., 'NVIDIA A100', 'Intel Xeon') used for running the experiments.
Software Dependencies | No | The paper mentions software components like the 'Adam optimizer' and 'KaHIP' but does not provide specific version numbers for them or any other software dependencies.
Experiment Setup | Yes | Both architectures consist of several blocks, where each block is a fully-connected layer + batch normalization (Ioffe & Szegedy, 2015) + ReLU activations. In the top-level network we use b = 3 and s = 512. In the second-level networks we use b = 2 and s = 390. To reduce overfitting, we use dropout with probability 0.1 during training. The networks are trained using the Adam optimizer (Kingma & Ba, 2015) for under 20 epochs on both levels. We reduce the learning rate multiplicatively at regular intervals. (A hedged sketch of this block architecture also follows the table.)
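Below is a minimal sketch of how a query might be answered with a learned space partition, in the spirit of the paper's Algorithm 1. It assumes the learned classifier returns one score per bin and that the highest-scoring bins are searched exhaustively; the `num_probes` multiprobe parameter and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def query_learned_partition(q, classifier, bins, data, k=10, num_probes=1):
    """Sketch: answer a nearest neighbor query using a learned partition.

    classifier(q) is assumed to return an array with one score per bin;
    `bins` maps each bin index to the indices of dataset points assigned
    to it; `num_probes` (an assumed multiprobe knob) controls how many of
    the top-scoring bins are searched.
    """
    scores = classifier(q)                            # one score per bin
    probe = np.argsort(scores)[::-1][:num_probes]     # highest-scoring bins
    candidates = np.concatenate([bins[b] for b in probe])
    # exact distance computation restricted to the candidate set
    dists = np.linalg.norm(data[candidates] - q, axis=1)
    return candidates[np.argsort(dists)[:k]]
```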
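The following PyTorch sketch illustrates the block architecture quoted in the Experiment Setup row: each block is a fully-connected layer + batch normalization + ReLU, with dropout (0.1) for regularization, trained with Adam and a multiplicatively decayed learning rate. The placement of dropout, the final linear layer mapping to the number of bins, and all numeric choices other than b, s, and the dropout probability (num_bins, learning rate, decay schedule) are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def make_partition_classifier(dim, num_bins, num_blocks, width, dropout=0.1):
    """Stack of (Linear + BatchNorm + ReLU + Dropout) blocks, ending in a
    linear layer over the bins. The final layer and dropout placement are
    assumptions; the block structure follows the paper's description."""
    layers, in_dim = [], dim
    for _ in range(num_blocks):
        layers += [nn.Linear(in_dim, width), nn.BatchNorm1d(width),
                   nn.ReLU(), nn.Dropout(dropout)]
        in_dim = width
    layers.append(nn.Linear(in_dim, num_bins))
    return nn.Sequential(*layers)

# Top-level network: b = 3 blocks of width s = 512, as stated above.
# dim=128 matches SIFT; num_bins=256 and lr=1e-3 are placeholder values.
top_level = make_partition_classifier(dim=128, num_bins=256, num_blocks=3, width=512)
optimizer = torch.optim.Adam(top_level.parameters(), lr=1e-3)
# "Reduce the learning rate multiplicatively at regular intervals":
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
```

A second-level network would be built the same way with num_blocks=2 and width=390, per the quoted setup.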