reproducibilityindex.ai

Scalable Nearest Neighbor Search for Optimal Transport

Authors: Arturs Backurs, Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We perform extensive experimental evaluation of nearest neighbor search algorithms in the W1 distance on realworld dataset. Our results show that compared to previous state of the art, Flowtree achieves up to 7.4 times faster running time. ... In this section we empirically evaluate Flowtree and compare it to various existing methods.
Researcher Affiliation	Collaboration	1Author names are ordered alphabetically. Code available at https://github.com/ilyaraz/ot_estimators. 2TTIC 3Microsoft 4MIT 5Microsoft Research.
Pseudocode	No	The paper describes algorithms in prose but does not include structured pseudocode or clearly labeled algorithm blocks.
Open Source Code	Yes	Code available at https://github.com/ilyaraz/ot_estimators.
Open Datasets	Yes	We use the standard benchmark 20news dataset of news-related online discussion groups, and a dataset of Amazon reviews... We use the MNIST dataset of handwritten digits.
Dataset Splits	No	The paper mentions tuning parameters on 'a random subset of 300 queries' but does not provide explicit training, validation, and test dataset splits with percentages or sample counts for model training.
Hardware Specification	Yes	All running times are measured on a Standard F72s v2 Microsoft Azure instance equipped with Intel Xeon Platinum 8168 CPU.
Software Dependencies	No	The paper mentions software like Num Py, Open BLAS, Python, C++, Eigen, POT, and Lemon graph library but does not provide specific version numbers for any of them.
Experiment Setup	No	A pipeline with ℓalgorithms has parameters c1, . . . , cℓ 1, where ci is the number of output candidates (non-pruned points) of the ith algorithm in the pipeline. ... The optimal parameters are listed in the appendix.