Ranking Preserving Hashing for Fast Similarity Search

Authors: Qifan Wang, Zhiwei Zhang, Luo Si

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "An extensive set of experiments on two large scale datasets demonstrate the superior ranking performance of the proposed approach over several state-of-the-art hashing methods."
Researcher Affiliation | Academia | "Qifan Wang, Zhiwei Zhang and Luo Si, Computer Science Department, Purdue University, West Lafayette, IN 47907, US. wang868@purdue.edu, zhan1187@purdue.edu, lsi@purdue.edu"
Pseudocode | Yes | Algorithm 1 from the paper, reproduced below (a runnable sketch follows the table):

    Algorithm 1: Ranking Preserving Hashing (RPH)
    Input: Training examples X, query examples Q and parameters α.
    Output: Hashing function W and hashing codes C.
    1: Compute the relevance vectors r_i^j in Eqn. 1.
    2: Initialize W.
    3: repeat {Gradient Descent}
    4:   Compute the gradient in Eqn. 12.
    5:   Compute the gradient in Eqn. 13.
    6:   Update W by optimizing the objective function.
    7: until the solution converges
    8: Compute the hashing codes C using Eqn. 2.
Open Source Code | No | The paper provides no concrete access to source code for the described method (e.g., a repository link, an explicit code-release statement, or code in supplementary materials).
Open Datasets | Yes | "NUSWIDE [Chua et al., 2009] is created by NUS lab for evaluating image retrieval techniques."
Dataset Splits | No | "For each experiment, we randomly choose 1k examples as testing queries. Within the remaining data examples, we randomly sample 500 training queries and for each query, we randomly sample 1000 data examples to construct the ground-truth relevance list." This specifies the training and testing query sets, but no distinct validation split is carved out of the dataset; parameter tuning is handled by cross-validation instead.
Hardware Specification | Yes | "We implement our algorithm using Matlab on a PC with Intel Duo Core i5-2400 CPU 3.1GHz and 8GB RAM."
Software Dependencies | No | The paper states "We implement our algorithm using Matlab" but gives no version number for Matlab or any other software dependency.
Experiment Setup | Yes | "The parameter α is tuned by cross validation through the grid {0.01, 0.1, 1, 10, 100}"; "For each experiment, we randomly choose 1k examples as testing queries. Within the remaining data examples, we randomly sample 500 training queries and for each query, we randomly sample 1000 data examples to construct the ground-truth relevance list."; "Finally, we repeat each experiment 10 times and report the result based on the average over the 10 runs."
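As a rough illustration of the optimization loop in Algorithm 1, here is a minimal NumPy sketch. The paper's actual objective is a ranking-based loss whose gradients are given in Eqns. 12-13, which are not reproduced in this report; to keep the skeleton runnable, the sketch substitutes a simple quadratic relevance-matching loss on tanh-relaxed codes plus an L2 penalty. Only the structure (initialize W, gradient descent to convergence, binary codes at the end) follows the pseudocode; every formula inside the loop is an assumption.

    import numpy as np

    def rph_train(X, Q, R, num_bits, alpha=0.1, lr=1e-3, max_iters=500, tol=1e-6):
        # X: (n, d) training examples; Q: (m, d) training queries;
        # R: (m, n) relevance vectors r_i^j, assumed precomputed via Eqn. 1.
        # NOTE: the loss/gradient below are a simplified stand-in, NOT the
        # paper's ranking objective and its gradients (Eqns. 12-13).
        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.01, size=(X.shape[1], num_bits))  # Step 2: initialize W

        prev_loss = np.inf
        for _ in range(max_iters):                    # Steps 3-7: gradient descent
            Hq, Hx = np.tanh(Q @ W), np.tanh(X @ W)   # relaxed query/example codes
            E = Hq @ Hx.T / num_bits - R              # code similarity vs. relevance
            loss = np.sum(E**2) + alpha * np.sum(W**2)
            dHq = 2.0 * (E @ Hx) / num_bits           # chain rule through the
            dHx = 2.0 * (E.T @ Hq) / num_bits         # tanh relaxation
            grad = Q.T @ (dHq * (1 - Hq**2)) + X.T @ (dHx * (1 - Hx**2))
            grad += 2.0 * alpha * W
            W -= lr * grad                            # Step 6: update W
            if abs(prev_loss - loss) < tol:           # Step 7: convergence check
                break
            prev_loss = loss

        C = np.sign(X @ W)   # Step 8: codes C; sign(XW) for Eqn. 2 is an assumption
        return W, C

With the split quoted above, R would be a 500 × 1000 matrix (training queries × sampled examples per query).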
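The quoted protocol also pins down the sampling procedure precisely enough to sketch. Below is a small index-level helper; the function name make_splits, the total pool size of 100,000, and the choice to sample each ground-truth list from the full remaining pool (the paper does not say whether a query is excluded from its own list) are all assumptions.

    import numpy as np

    def make_splits(n_examples, seed=0):
        # Sketch of the quoted split: 1k testing queries, then 500 training
        # queries and a 1000-example ground-truth list per training query,
        # all drawn from the remaining pool. Returns index arrays only.
        rng = np.random.default_rng(seed)
        perm = rng.permutation(n_examples)
        test_queries = perm[:1000]          # "1k examples as testing queries"
        remaining = perm[1000:]
        train_queries = rng.choice(remaining, size=500, replace=False)
        # Whether a query is excluded from its own sampled list is not
        # specified; this sketch samples from the full remainder.
        gt_lists = np.stack([rng.choice(remaining, size=1000, replace=False)
                             for _ in range(500)])
        return test_queries, train_queries, gt_lists

    # α grid quoted from the paper; 10 repetitions with different seeds,
    # averaging the results, mirrors "repeat each experiment 10 times".
    ALPHA_GRID = [0.01, 0.1, 1, 10, 100]
    splits = [make_splits(n_examples=100_000, seed=s) for s in range(10)]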