Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Nearest Neighbor Graphs from Noisy Distance Samples

Authors: Blake Mason, Ardhendu Tripathy, Robert Nowak

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 5 we show ANNTri's empirical performance on both simulated and real data. In particular, we highlight its efficiency in learning from human judgments.
Researcher Affiliation | Academia | Blake Mason, University of Wisconsin, Madison, WI 53706, EMAIL; Ardhendu Tripathy, University of Wisconsin, Madison, WI 53706, EMAIL; Robert Nowak, University of Wisconsin, Madison, WI 53706, EMAIL
Pseudocode | Yes | Algorithm 1 (ANNTri) and Algorithm 2 (SETri) are provided with structured steps.
Open Source Code | Yes | Implementations of ANNTri, ANN, and RANDOM can be found alongside a demo and summary slides at https://github.com/blakemas/nngraph.
Open Datasets | Yes | For this experiment, we used a set X of 85 images of shoes drawn from the UT Zappos50k dataset [32, 33].
Dataset Splits | No | The paper mentions 'cross validation' for selecting an embedding dimension, but does not provide specific training/test/validation dataset splits (percentages, counts, or predefined standard splits) for its experiments.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or cluster specifications) used for running its experiments.
Software Dependencies | No | The paper mentions general tools and algorithms (e.g., 'Python array/Matlab notation', 'STE algorithm') but does not list specific software components with their version numbers.
Experiment Setup | Yes | To construct the tightest possible confidence bounds for SETri, we use the law of the iterated logarithm as in [18] with parameters ϵ = 0.7 and δ = 0.1. Our analysis bounds the number of queries made to the oracle. We visualize the performance by tracking the empirical error rate with the number of queries made per point. ... All rounds were capped at 10^5 samples for efficiency.
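
For readers who want a feel for the confidence bounds mentioned in the Experiment Setup row, the following is a minimal sketch of an anytime confidence half-width based on the law of the iterated logarithm. It assumes the finite-sample form popularized in the lil' UCB line of work; whether this matches the bound in [18] exactly is an assumption. It also treats 0.7 as the bound's ε parameter and assumes a sub-Gaussian scale of sigma = 0.5. The function name `lil_bound` is hypothetical and does not come from the paper or its repository.

```python
import math


def lil_bound(t, eps=0.7, delta=0.1, sigma=0.5):
    """Anytime confidence half-width after t samples (assumed LIL form).

    Assumed formula (lil' UCB style, not confirmed against [18]):
        (1 + sqrt(eps)) * sqrt(2 * sigma^2 * (1 + eps)
                               * log(log((1 + eps) * t) / delta) / t)
    sigma = 0.5 is an illustrative sub-Gaussian scale, not a value
    taken from the paper.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    inner = math.log((1 + eps) * t) / delta
    if inner <= 1:
        # The formula is not meaningful here; report an uninformative bound.
        return math.inf
    radius = math.sqrt(2 * sigma**2 * (1 + eps) * math.log(inner) / t)
    return (1 + math.sqrt(eps)) * radius


if __name__ == "__main__":
    # The half-width shrinks as more samples of a distance are collected.
    for t in (1, 10, 100, 1000):
        print(t, round(lil_bound(t), 4))
```

Under these assumptions, an empirical mean of t noisy distance samples would be reported as the mean plus or minus `lil_bound(t)`, and the interval stays valid at every t simultaneously rather than at a single fixed sample size.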