A Greedy Approach for Budgeted Maximum Inner Product Search

Authors: Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we perform extensive empirical comparisons to compare Greedy-MIPS with other state-of-the-art fast MIPS approaches on both real-world and synthetic datasets:
Researcher Affiliation Collaboration Hsiang-Fu Yu Amazon Inc. rofuyu@cs.utexas.edu Cho-Jui Hsieh University of California, Davis chohsieh@ucdavis.edu Qi Lei The University of Texas at Austin leiqi@ices.utexas.edu Inderjit S. Dhillon The University of Texas at Austin inderjit@cs.utexas.edu
Pseudocode Yes Algorithm 1 CondIter: an iterator over j ∈ {1, ..., n} based on the conditional ranking π_t(j | w). This code assumes that the k sorted index arrays s_t[r], r = 1, ..., n, t = 1, ..., k are available. class CondIter: def constructor(dim_idx, query_val): (t, w, ptr) ← (dim_idx, query_val, 1) def current(): return s_t[ptr] if w > 0, s_t[n − ptr + 1] otherwise def hasNext(): return (ptr < n) def getNext(): ptr ← ptr + 1 and return current()
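The CondIter pseudocode above can be sketched as a small, runnable Python class. This is an illustrative reconstruction, not the authors' code: the sorted index arrays `s` are assumed to be precomputed, one per embedding dimension, each listing item indices in descending order of that coordinate.

```python
import numpy as np

class CondIter:
    """Iterator over candidate item indices j in {1, ..., n}, ordered by the
    conditional ranking on one dimension t given the query value w on it.

    `s` is a list of k precomputed index arrays (one per dimension), each
    sorting the n items by that coordinate in descending order. Names and
    layout are illustrative assumptions, not the paper's implementation.
    """
    def __init__(self, s, dim_idx, query_val):
        self.s = s[dim_idx]   # sorted index array s_t for dimension t
        self.w = query_val    # query's value on dimension t
        self.n = len(self.s)
        self.ptr = 1          # 1-based pointer, matching the pseudocode

    def current(self):
        # Positive query value: largest coordinates contribute most, so
        # walk the array front-to-back; negative: walk back-to-front.
        if self.w > 0:
            return self.s[self.ptr - 1]
        return self.s[self.n - self.ptr]

    def has_next(self):
        return self.ptr < self.n

    def get_next(self):
        self.ptr += 1
        return self.current()


# Tiny usage example: 3 items with k = 2 dimensional embeddings.
H = np.array([[0.5, -1.0],
              [2.0,  0.3],
              [-0.7, 1.5]])
s = [np.argsort(-H[:, t]) for t in range(H.shape[1])]

it = CondIter(s, dim_idx=0, query_val=1.0)
first = it.current()   # item with the largest coordinate on dimension 0
```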
Open Source Code No The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets Yes We use netflix and yahoo-music as our real-world recommender system datasets. There are 17,770 and 624,961 items in netflix and yahoo-music, respectively. In particular, we obtain the user embeddings {w_i} ⊂ R^k and item embeddings {h_j} ⊂ R^k by the standard low-rank matrix factorization [4] with k ∈ {50, 200}.
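In this setup, MIPS means scoring items by the inner product w_i^T h_j between factorized embeddings. A minimal brute-force baseline (an assumed sketch with random stand-in embeddings, not the paper's Greedy-MIPS method) looks like:

```python
import numpy as np

# Stand-in embeddings of the shape described above: m users, n items,
# each in R^k with k = 50 (the smaller setting from the paper).
rng = np.random.default_rng(0)
k = 50
W = rng.standard_normal((100, k))    # user embeddings {w_i}
H = rng.standard_normal((1000, k))   # item embeddings {h_j}

def exact_mips(w, H, budget=5):
    """Exact top-`budget` MIPS by a full scan: score every item with an
    inner product, then return the `budget` highest-scoring indices
    in descending score order. O(nk) per query; fast MIPS methods
    such as Greedy-MIPS aim to beat this scan under a budget."""
    scores = H @ w
    top = np.argpartition(-scores, budget)[:budget]
    return top[np.argsort(-scores[top])]

best = exact_mips(W[0], H)
```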
Dataset Splits No The paper mentions using real-world and synthetic datasets and evaluating on 'a randomly selected 2,000 query vectors' but does not provide specific training, validation, or test dataset split percentages or sample counts.
Hardware Specification No The paper mentions that 'all the compared approaches are implemented in C++' and discusses runtime, but it does not specify any hardware details such as CPU/GPU models or memory used for the experiments.
Software Dependencies No The paper states that 'all the compared approaches are implemented in C++', but it does not specify any ancillary software dependencies with version numbers (e.g., specific libraries, frameworks, or compilers).
Experiment Setup No The paper mentions varying parameters like 'depth of PCA tree' for PCA-MIPS, 'values (a, b)' for LSH-MIPS, and 'number of samples' for Diamond-MSIPS to control trade-offs, but it does not provide specific hyperparameter values or detailed system-level training configurations (e.g., learning rates, batch sizes, optimizers) for its own method or the baselines.