Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Non-metric Similarity Graphs for Maximum Inner Product Search

Authors: Stanislav Morozov, Artem Babenko

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	By an extensive comparison to the existing approaches, we show that the proposed method is a game-changer in terms of the runtime/accuracy trade-off for the MIPS problem. We introduce a new large-scale dataset for the MIPS algorithms evaluation to facilitate research in this direction.
Researcher Affiliation	Collaboration	Stanislav Morozov Yandex, Lomonosov Moscow State University EMAIL Artem Babenko Yandex, National Research University Higher School of Economics EMAIL
Pseudocode	Yes	Algorithm 1 Greedy walk; Algorithm 2 NSW graph construction
Open Source Code	Yes	The dataset and the C++ implementation of our method are available online1. Footnote 1: https://github.com/stanis-morozov/ip-nsw
Open Datasets	Yes	Datasets. We summarize the information on the benchmark datasets in Table 1. The Netﬂix, Movie Lens and Yahoo!Music datasets are the established benchmarks for the MIPS problem. Music100 is a new dataset that we introduce to the community2. This dataset was obtained by IALSfactorization[26]... Footnote 2: https://github.com/stanis-morozov/ip-nsw
Dataset Splits	No	The paper mentions training, testing, and evaluation metrics but does not provide specific details on validation dataset splits, percentages, or methods for creating them.
Hardware Specification	Yes	All the experiments were performed on Intel Xeon E5-2650 machine in a single thread mode.
Software Dependencies	No	The paper mentions using the 'Intel MKL library' and provides a 'C++ implementation' but does not specify version numbers for any software dependencies or libraries.
Experiment Setup	Yes	The primary parameter of the NSW is the maximum vertex degree M... In our experiments we use M = 32 edges per vertex. We performed the experiments with K = 1 and K = 10. The recall values were averaged over 10, 000 randomly sampled queries.