Learning to Hash on Structured Data

Authors: Qifan Wang, Luo Si, Bin Shen

AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two datasets clearly demonstrate the advantages of the proposed method over several state-of-the-art hashing methods.
Researcher Affiliation | Academia | Qifan Wang, Luo Si, and Bin Shen; Computer Science Department, Purdue University, West Lafayette, IN 47907, US; wang868@purdue.edu, lsi@purdue.edu, bshen@purdue.edu
Pseudocode | Yes | Algorithm 1: Hashing on Structured Data (HSD)
Open Source Code | No | The paper does not provide any specific links or statements about the availability of its source code.
Open Datasets | Yes | WebKB contains 8280 webpages in total collected from four universities. The webpages without any incoming and outgoing links are deleted, resulting in a subset of 6883 webpages. The tf-idf (normalized term frequency and log inverse document frequency) (Manning, Raghavan, and Schütze 2008) features are extracted for each webpage. NUS-WIDE (Chua et al. 2009) is created by the NUS lab for evaluating image annotation and retrieval techniques. Both datasets are referenced with footnote URLs: http://www.cs.cmu.edu/~webkb and http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm.
Dataset Splits | Yes | The parameters α and β are tuned by 5-fold cross validation through the grid {0.01, 0.1, 1, 10, 100} on the training set, and the paper discusses how this choice affects performance.
Hardware Specification | Yes | We implement our algorithm using Matlab on a PC with Intel Duo Core i5-2400 CPU 3.1GHz and 8GB RAM.
Software Dependencies | No | The paper mentions Matlab but does not specify a version number or any other software dependencies with version numbers.
Experiment Setup | Yes | The parameters α and β are tuned by 5-fold cross validation through the grid {0.01, 0.1, 1, 10, 100} on the training set, and the paper discusses how this choice affects performance.
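
The Open Datasets row quotes the feature representation used for WebKB: tf-idf, i.e. normalized term frequency weighted by log inverse document frequency (Manning, Raghavan, and Schütze 2008). The paper does not state its exact weighting variant, so the minimal Python sketch below (tf divided by document length, idf = log(N/df)) is an illustrative assumption, not the authors' feature pipeline.

```python
import math
from collections import Counter

def tfidf(documents):
    """Toy tf-idf vectors: normalized term frequency times log inverse document frequency.

    Assumed variant: tf is divided by document length and idf = log(N / df);
    the paper does not specify these details.
    """
    n_docs = len(documents)
    doc_freq = Counter()                      # number of documents containing each term
    for doc in documents:
        doc_freq.update(set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        length = len(doc) or 1                # guard against empty documents
        vectors.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Example with three tiny tokenized "webpages"
docs = [["course", "homework", "exam"],
        ["faculty", "research", "course"],
        ["research", "lab"]]
print(tfidf(docs))
```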
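
The Dataset Splits and Experiment Setup rows both quote the same tuning protocol: α and β are chosen by 5-fold cross validation over the grid {0.01, 0.1, 1, 10, 100} on the training set. Since no source code is released, the sketch below only illustrates that protocol; `train_hsd` and `evaluate` are hypothetical placeholders standing in for HSD training and retrieval evaluation, not the authors' implementation.

```python
import itertools
import numpy as np

def train_hsd(X_train, alpha, beta):
    # Placeholder "model": the real method would learn binary hash functions
    # with regularization weights alpha and beta.
    return {"alpha": alpha, "beta": beta, "mean": X_train.mean(axis=0)}

def evaluate(model, X_val):
    # Placeholder validation score; the paper reports retrieval precision/recall.
    return -float(np.linalg.norm(X_val - model["mean"], axis=1).mean())

def five_fold_grid_search(X, grid=(0.01, 0.1, 1, 10, 100), n_folds=5, seed=0):
    """Pick (alpha, beta) maximizing the mean validation score over 5 folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    best_params, best_score = None, -np.inf
    for alpha, beta in itertools.product(grid, grid):
        scores = []
        for k in range(n_folds):
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            model = train_hsd(X[train_idx], alpha, beta)
            scores.append(evaluate(model, X[folds[k]]))
        mean_score = float(np.mean(scores))
        if mean_score > best_score:
            best_params, best_score = (alpha, beta), mean_score
    return best_params, best_score

if __name__ == "__main__":
    X = np.random.default_rng(0).normal(size=(200, 16))  # toy feature matrix
    print(five_fold_grid_search(X))
```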