A Greedy Approach for Budgeted Maximum Inner Product Search

Authors: Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we perform extensive empirical comparisons to compare Greedy-MIPS with other state-of-the-art fast MIPS approaches on both real-world and synthetic datasets:
Researcher Affiliation Collaboration Hsiang-Fu Yu Amazon Inc. rofuyu@cs.utexas.edu Cho-Jui Hsieh University of California, Davis chohsieh@ucdavis.edu Qi Lei The University of Texas at Austin leiqi@ices.utexas.edu Inderjit S. Dhillon The University of Texas at Austin inderjit@cs.utexas.edu
Pseudocode Yes Algorithm 1 CondIter: an iterator over j ∈ {1, ..., n} based on the conditional ranking π_t(j | w). This code assumes that the k sorted index arrays s_t[r], r = 1, ..., n, t = 1, ..., k are available. class CondIter: def constructor(dim_idx, query_val): (t, w, ptr) ← (dim_idx, query_val, 1) def current(): return s_t[ptr] if w > 0, s_t[n − ptr + 1] otherwise def hasNext(): return (ptr < n) def getNext(): ptr ← ptr + 1 and return current()
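The CondIter pseudocode above can be sketched as a small, runnable Python class. This is an illustrative reconstruction, not the authors' code: the sorted index arrays `s` are assumed to be precomputed, one per embedding dimension, each listing item indices in descending order of that coordinate.

```python
import numpy as np

class CondIter:
    """Iterator over candidate item indices j in {1, ..., n}, ordered by the
    conditional ranking on one dimension t given the query value w on it.

    `s` is a list of k precomputed index arrays (one per dimension), each
    sorting the n items by that coordinate in descending order. Names and
    layout are illustrative assumptions, not the paper's implementation.
    """
    def __init__(self, s, dim_idx, query_val):
        self.s = s[dim_idx]   # sorted index array s_t for dimension t
        self.w = query_val    # query's value on dimension t
        self.n = len(self.s)
        self.ptr = 1          # 1-based pointer, matching the pseudocode

    def current(self):
        # Positive query value: largest coordinates contribute most, so
        # walk the array front-to-back; negative: walk back-to-front.
        if self.w > 0:
            return self.s[self.ptr - 1]
        return self.s[self.n - self.ptr]

    def has_next(self):
        return self.ptr < self.n

    def get_next(self):
        self.ptr += 1
        return self.current()


# Tiny usage example: 3 items with k = 2 dimensional embeddings.
H = np.array([[0.5, -1.0],
              [2.0,  0.3],
              [-0.7, 1.5]])
s = [np.argsort(-H[:, t]) for t in range(H.shape[1])]

it = CondIter(s, dim_idx=0, query_val=1.0)
first = it.current()   # item with the largest coordinate on dimension 0
```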
Open Source Code No The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets Yes We use netflix and yahoo-music as our real-world recommender system datasets. There are 17,770 and 624,961 items in netflix and yahoo-music, respectively. In particular, we obtain the user embeddings {w_i} ⊂ R^k and item embeddings {h_j} ⊂ R^k by the standard low-rank matrix factorization [4] with k ∈ {50, 200}.
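In this setup, MIPS means scoring items by the inner product w_i^T h_j between factorized embeddings. A minimal brute-force baseline (an assumed sketch with random stand-in embeddings, not the paper's Greedy-MIPS method) looks like:

```python
import numpy as np

# Stand-in embeddings of the shape described above: m users, n items,
# each in R^k with k = 50 (the smaller setting from the paper).
rng = np.random.default_rng(0)
k = 50
W = rng.standard_normal((100, k))    # user embeddings {w_i}
H = rng.standard_normal((1000, k))   # item embeddings {h_j}

def exact_mips(w, H, budget=5):
    """Exact top-`budget` MIPS by a full scan: score every item with an
    inner product, then return the `budget` highest-scoring indices
    in descending score order. O(nk) per query; fast MIPS methods
    such as Greedy-MIPS aim to beat this scan under a budget."""
    scores = H @ w
    top = np.argpartition(-scores, budget)[:budget]
    return top[np.argsort(-scores[top])]

best = exact_mips(W[0], H)
```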
Dataset Splits No The paper mentions using real-world and synthetic datasets and evaluating on 'a randomly selected 2,000 query vectors' but does not provide specific training, validation, or test dataset split percentages or sample counts.
Hardware Specification No The paper mentions that 'all the compared approaches are implemented in C++' and discusses runtime, but it does not specify any hardware details such as CPU/GPU models or memory used for the experiments.
Software Dependencies No The paper states that 'all the compared approaches are implemented in C++', but it does not specify any ancillary software dependencies with version numbers (e.g., specific libraries, frameworks, or compilers).
Experiment Setup No The paper mentions varying parameters like 'depth of PCA tree' for PCA-MIPS, 'values (a, b)' for LSH-MIPS, and 'number of samples' for Diamond-MSIPS to control trade-offs, but it does not provide specific hyperparameter values or detailed system-level training configurations (e.g., learning rates, batch sizes, optimizers) for its own method or the baselines.