RHash: Robust Hashing via L_infinity-norm Distortion
Authors: Amirali Aghazadeh, Andrew Lan, Anshumali Shrivastava, Richard Baraniuk
IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A range of experimental evaluations demonstrates the superiority of RHash over ten state-of-the-art binary hashing schemes. Fourth, experimentally, we demonstrate the superior performance of RHash over ten state-of-the-art binary hashing algorithms using an exhaustive set of experimental evaluations involving six diverse datasets and three different performance metrics (distance preservation, Hamming distance ranking, and Kendall's τ ranking performance). |
| Researcher Affiliation | Collaboration | ¹Rice University, Houston, TX, USA; ²Princeton University, Princeton, NJ, USA. {amirali,anshumali,richb}@rice.edu, andrew.lan@princeton.edu |
| Pseudocode | No | More specifically, in iteration ℓ, we perform the following four steps until convergence: (This section describes the algorithm steps in prose rather than a structured pseudocode block or algorithm figure.) |
| Open Source Code | No | No statement explicitly providing open-source code for the methodology, nor any links to a repository, were found. |
| Open Datasets | Yes | MNIST is a collection of 60,000 28×28 greyscale images of handwritten digits [LeCun and Cortes, 1998]. Photo-Tourism is a corpus of approximately 300,000 image patches represented using scale-invariant feature transform (SIFT) features [Lowe, 2004] in R^128 [Snavely et al., 2006]. LabelMe is a collection of over 20,000 images represented using GIST descriptors in R^512 [Torralba et al., 2008]. Peekaboom is a collection of 60,000 images represented using GIST descriptors in R^512 [Torralba et al., 2008]. |
| Dataset Splits | No | We randomly select Q = 100 data points from the Random, Translating squares, and MNIST datasets. We then apply the RHash, RHash-CG, and all of the baseline algorithms on each dataset for different target binary code word lengths M from 1 to 70 bits. We then randomly select a separate set of Q = 1000 data points and use it to test the performance of RHash-CG and the other baseline algorithms in terms of MAP with k = 50. The paper mentions training and testing sets, but no explicit validation set for hyperparameter tuning with specific split percentages or counts. (A minimal sketch of this MAP-at-k evaluation appears after the table.) |
| Hardware Specification | No | BRE fails to execute on a standard desktop PC with 12 GB of RAM due to the size of the secant set. This refers to hardware on which a baseline algorithm *failed* to run, not the specific hardware used for the reported successful experiments, and it lacks detailed specifications like CPU/GPU models. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names like PyTorch, TensorFlow with their versions) are mentioned. |
| Experiment Setup | Yes | We set the RHash and RHash-CG algorithm parameters to the common choice of ρ = 1 and η = 1.6. We follow the continuation approach [Wen et al., 2010] to set the value of α. We start with a small value of α (e.g., α = 1), in order to avoid becoming stuck in bad local minima, and then gradually increase α as the algorithm proceeds. As the algorithm moves closer to convergence and has obtained a reasonably good estimate of the parameters W and λ, we set α = 10, which enforces a good approximation of the sign function (see Lemma 2 in Appendix A for an analysis of the accuracy of this approximation). (A sketch of this continuation schedule appears below the table.) |
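
The continuation schedule quoted in the Experiment Setup row (start with a small α and raise it toward α = 10 so that the smoothed objective approaches the hard sign function) can be illustrated with a short sketch. This is only an illustration under assumptions: the quoted text does not specify the exact smooth surrogate, so tanh(αx) and the geometric schedule below are illustrative choices, not the authors' code.

```python
import numpy as np

def soft_sign(x, alpha):
    """Smooth surrogate for sign(x): tanh(alpha * x) approaches sign(x) as alpha grows.
    (Illustrative choice; the exact surrogate used by RHash is not given in this table.)"""
    return np.tanh(alpha * x)

def continuation_schedule(num_steps, alpha_start=1.0, alpha_final=10.0):
    """Gradually increase alpha from alpha_start to alpha_final over the run,
    mirroring the continuation strategy described in the Experiment Setup row."""
    return np.geomspace(alpha_start, alpha_final, num_steps)

# As alpha increases, the surrogate saturates toward the hard sign function.
x = np.linspace(-2, 2, 5)
for alpha in continuation_schedule(num_steps=4):
    print(f"alpha = {alpha:4.2f}: {np.round(soft_sign(x, alpha), 3)}")
print("sign(x):     ", np.sign(x))
```

Starting with a small α keeps the surrogate smooth early on, which is what the quoted passage credits with avoiding bad local minima; only once W and λ are reasonably well estimated is α pushed to 10.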
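For the Hamming-distance-ranking evaluation quoted in the Dataset Splits row (a held-out set of Q = 1000 points, MAP with k = 50), the sketch below shows one way such a metric can be computed from binary codes. The function names, the ground-truth-neighbour convention, and the toy data at the end are assumptions made for illustration; they are not the paper's evaluation code.

```python
import numpy as np

def hamming_distances(query_code, database_codes):
    """Number of differing bits between one binary code and every row of database_codes."""
    return np.count_nonzero(database_codes != query_code, axis=1)

def mean_average_precision_at_k(codes, true_neighbors, k=50):
    """MAP@k under Hamming-distance ranking.

    codes:          (N, M) array of binary codes, one row per data point
    true_neighbors: list of index arrays; true_neighbors[i] holds the ground-truth
                    neighbours of point i in the original (e.g. Euclidean) space
    """
    average_precisions = []
    for i, relevant in enumerate(true_neighbors):
        relevant = {int(j) for j in relevant}
        dists = hamming_distances(codes[i], codes)
        dists[i] = codes.shape[1] + 1              # push the query past any real distance
        ranking = np.argsort(dists, kind="stable")[:k]
        hits, precisions = 0, []
        for rank, idx in enumerate(ranking, start=1):
            if int(idx) in relevant:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(np.mean(precisions) if precisions else 0.0)
    return float(np.mean(average_precisions))

# Toy call with random codes and arbitrary ground truth, purely to show the interface.
rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(1000, 64))        # Q = 1000 points, 64-bit codes
true_neighbors = [rng.choice(1000, size=50, replace=False) for _ in range(1000)]
print(mean_average_precision_at_k(codes, true_neighbors, k=50))
```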