RHash: Robust Hashing via ℓ∞-norm Distortion

Authors: Amirali Aghazadeh, Andrew Lan, Anshumali Shrivastava, Richard Baraniuk

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A range of experimental evaluations demonstrate the superiority of RHash over ten state-of-the-art binary hashing schemes. Fourth, experimentally, we demonstrate the superior performance of RHash over ten state-of-the-art binary hashing algorithms using an exhaustive set of experimental evaluations involving six diverse datasets and three different performance metrics (distance preservation, Hamming distance ranking, and Kendall's τ ranking performance). (A sketch of the Kendall's τ ranking metric appears after the table.)
Researcher Affiliation | Collaboration | 1 Rice University, Houston, TX, USA; 2 Princeton University, Princeton, NJ, USA; {amirali,anshumali,richb}@rice.edu, andrew.lan@princeton.edu
Pseudocode | No | More specifically, in iteration ℓ, we perform the following four steps until convergence: (The paper describes the algorithm steps in prose rather than in a structured pseudocode block or algorithm figure.)
Open Source Code | No | No statement explicitly providing open-source code for the methodology, nor any link to a repository, was found.
Open Datasets | Yes | MNIST is a collection of 60,000 28×28 greyscale images of handwritten digits [LeCun and Cortes, 1998]. Photo-Tourism is a corpus of approximately 300,000 image patches represented using scale-invariant feature transform (SIFT) features [Lowe, 2004] in R^128 [Snavely et al., 2006]. LabelMe is a collection of over 20,000 images represented using GIST descriptors in R^512 [Torralba et al., 2008]. Peekaboom is a collection of 60,000 images represented using GIST descriptors in R^512 [Torralba et al., 2008].
Dataset Splits | No | We randomly select Q = 100 data points from the Random, Translating squares, and MNIST datasets. We then apply the RHash, RHash-CG, and all of the baseline algorithms on each dataset for different target binary code word lengths M from 1 to 70 bits. We then randomly select a separate set of Q = 1000 data points and use it to test the performance of RHash-CG and other baseline algorithms in terms of MAP with k = 50. The paper mentions training and testing sets, but gives no explicit validation set for hyperparameter tuning and no specific split percentages or counts. (A sketch of this train/test selection appears after the table.)
Hardware Specification | No | BRE fails to execute on a standard desktop PC with 12 GB of RAM due to the size of the secant set. This refers to hardware on which a baseline algorithm *failed* to run, not the specific hardware used for the reported successful experiments, and it lacks detailed specifications like CPU/GPU models.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., libraries such as PyTorch or TensorFlow with their versions) are mentioned.
Experiment Setup | Yes | We set the RHash and RHash-CG algorithm parameters to the common choice of ρ = 1 and η = 1.6. We follow the continuation approach [Wen et al., 2010] to set the value of α. We start with a small value of α (e.g., α = 1), in order to avoid becoming stuck in bad local minima, and then gradually increase α as the algorithm proceeds. As the algorithm moves closer to convergence and has obtained a reasonably good estimate of the parameters W and λ, we set α = 10, which enforces a good approximation of the sign function (see Lem. 2 in Appendix A for an analysis of the accuracy of this approximation). (A sketch of this continuation schedule appears below.)
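The Kendall's τ ranking metric cited in the Research Type row can be illustrated with a short sketch. This is a minimal example, not the authors' evaluation code: the function name kendall_tau_ranking, the random-projection codes, and the single-query setup are illustrative assumptions; only the idea of comparing input-space distance rankings against Hamming-distance rankings comes from the paper.

```python
# Minimal sketch (not the authors' code): Kendall's tau between the ranking
# induced by Euclidean distances in the input space and the ranking induced
# by Hamming distances between binary codes, for a single query point.
import numpy as np
from scipy.stats import kendalltau  # SciPy's standard Kendall's tau implementation

def kendall_tau_ranking(X, B, query_idx=0):
    """X: (N, d) data points; B: (N, M) binary codes in {0, 1}."""
    euclidean = np.linalg.norm(X - X[query_idx], axis=1)   # input-space distances
    hamming = np.count_nonzero(B != B[query_idx], axis=1)  # code-space distances
    tau, _ = kendalltau(euclidean, hamming)
    return tau

# Toy usage with random data and random-hyperplane codes (illustrative only,
# not RHash): a higher tau means the codes better preserve the ranking.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 32))
W = rng.standard_normal((32, 16))
B = (X @ W > 0).astype(int)
print(kendall_tau_ranking(X, B))
```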
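The Dataset Splits row describes the train/test selection the paper reports. Below is a minimal sketch of that selection under stated assumptions: the data array is a stand-in for one of the datasets, and rhash_train, rhash_encode, and mean_average_precision are hypothetical placeholders for the training and MAP@k evaluation steps, which the paper does not provide as code.

```python
# Minimal sketch (assumptions noted above): randomly select Q = 100 training
# points and a disjoint set of Q = 1000 test points, then sweep the target
# code length M from 1 to 70 bits.
import numpy as np

rng = np.random.default_rng(seed=0)
data = rng.standard_normal((60000, 784))   # stand-in for one dataset, e.g. MNIST vectors

perm = rng.permutation(len(data))
train_idx, test_idx = perm[:100], perm[100:1100]   # Q = 100 train, Q = 1000 test
X_train, X_test = data[train_idx], data[test_idx]

for M in range(1, 71):                     # target binary code word lengths, 1..70 bits
    # Hypothetical calls standing in for training, encoding, and MAP@50 evaluation:
    # model = rhash_train(X_train, num_bits=M, rho=1.0, eta=1.6)
    # codes = rhash_encode(model, X_test)
    # score = mean_average_precision(codes, X_test, k=50)
    pass
```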
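The Experiment Setup row quotes the continuation strategy for α. The sketch below shows one way such a schedule could look; the tanh surrogate for the sign function and the geometric ramp factor are assumptions, since the quoted text specifies only the starting value (α = 1), the final value (α = 10), and the parameter choices ρ = 1 and η = 1.6.

```python
# Minimal sketch of a continuation schedule for alpha (assumed details: the
# tanh surrogate and the 1.25x ramp are illustrative, not from the paper).
import numpy as np

def smooth_sign(x, alpha):
    """Smooth surrogate for sign(x); approaches sign(x) as alpha grows."""
    return np.tanh(alpha * x)

rho, eta = 1.0, 1.6                 # parameter choices quoted in the row above
alpha, alpha_max = 1.0, 10.0        # start small to avoid bad local minima
for outer_iter in range(50):
    # ... one round of the RHash / RHash-CG updates for (W, lambda) would go here ...
    alpha = min(alpha * 1.25, alpha_max)   # gradually tighten the sign approximation
```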