Pairwise Relationship Guided Deep Hashing for Cross-Modal Retrieval

Authors: Erkun Yang, Cheng Deng, Wei Liu, Xianglong Liu, Dacheng Tao, Xinbo Gao

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our proposed method yields state-of-the-art results on two cross-modal retrieval datasets.
Researcher Affiliation | Collaboration | 1) School of Electronic Engineering, Xidian University, Xi'an 710071, China; 2) Tencent AI Lab, Shenzhen, China; 3) Beihang University, Beijing 100191, China; 4) Centre for Artificial Intelligence, University of Technology Sydney, NSW 2007, Australia
Pseudocode | Yes | We summarize the whole alternating learning procedure in Algorithm 1. (A hedged sketch of such an alternating loop follows this table.)
Open Source Code | No | The paper states: 'Source codes of most baselines are kindly provided by the authors, except for DCMH, CMFH and CCA.', but it gives no link to, or release of, the source code for the proposed PRDH method.
Open Datasets | Yes | MIRFlickr (Huiskes and Lew 2008): It originally consists of 25,000 instances... NUS-WIDE (Chua et al. 2009): It is a real-world web image database...
Dataset Splits | Yes | For MIRFlickr, we take 2000 instances as the test set and the rest as the retrieval set. To reduce computational costs, the training set includes 5000 instances randomly sampled from the retrieval set. For NUS-WIDE, we take 1% of the dataset as the test set and the remainder as the retrieval set. We also randomly sample 5000 instances from the retrieval set to construct the training set. We use a validation set to choose the hyperparameters λ and γ. (A sketch of this split construction follows this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the VGG-F network and multilayer perceptrons (MLPs) as architectural components, but does not list software dependencies with version numbers (e.g., library or framework names and their versions).
Experiment Setup | Yes | According to the results on the validation set, we set λ = γ = 1 in our experiments. The batch size is fixed at 128 and the number of outer-loop iterations in Algorithm 1 is set to 1000. The first seven layers of the CNN module for the image modality are fine-tuned from the VGG-F model; the new fch layer and the multilayer perceptron for the text modality are jointly trained with the mini-batch SGD method. (A configuration sketch follows this table.)
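
The alternating learning procedure itself is not spelled out in this excerpt. Below is a minimal Python sketch of how such a loop is typically structured, assuming Algorithm 1 alternates between updating the image network, the text network, and the binary codes; the gradient steps are placeholders, not the paper's actual updates.

    import numpy as np

    rng = np.random.default_rng(0)
    n, code_len = 5000, 32                  # training-set size and code length

    F = rng.standard_normal((n, code_len))  # image-network outputs (stand-in)
    G = rng.standard_normal((n, code_len))  # text-network outputs (stand-in)
    B = np.sign(F + G)                      # binary codes in {-1, +1}

    for it in range(1000):                  # outer loop (paper: 1000 iterations)
        # Step 1: fix G and B, update the image network.
        # A real mini-batch SGD pass over the CNN would go here.
        F += 0.1 * (B - F)
        # Step 2: fix F and B, update the text network (the MLP).
        G += 0.1 * (B - G)
        # Step 3: fix F and G, refresh the binary codes.
        B = np.sign(F + G)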
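
The split construction is described precisely enough to sketch at the index level. The following Python snippet is a hedged reconstruction; make_splits is a hypothetical helper, and the NUS-WIDE total size shown is illustrative only.

    import numpy as np

    rng = np.random.default_rng(42)

    def make_splits(n_total, n_test, n_train=5000):
        # Partition instance indices into test / retrieval sets, then sample
        # the training set from the retrieval set, following the paper's counts.
        perm = rng.permutation(n_total)
        test_idx = perm[:n_test]
        retrieval_idx = perm[n_test:]
        train_idx = rng.choice(retrieval_idx, size=n_train, replace=False)
        return test_idx, retrieval_idx, train_idx

    # MIRFlickr: 2000 test instances out of the original 25,000.
    mir_test, mir_retrieval, mir_train = make_splits(n_total=25000, n_test=2000)
    # NUS-WIDE: 1% of the dataset as the test set (n_total here is illustrative).
    nus_test, nus_retrieval, nus_train = make_splits(n_total=190000, n_test=1900)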
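
The reported hyperparameters fit in a small configuration object. The sketch below records only values stated in the excerpt; entries marked 'assumed' (such as the learning rate) are illustrative guesses.

    # Values reported in the paper; entries marked 'assumed' are not from the text.
    config = {
        "lam": 1.0,                 # λ, chosen on a validation set
        "gamma": 1.0,               # γ, chosen on a validation set
        "batch_size": 128,
        "outer_iterations": 1000,   # outer-loop count in Algorithm 1
        "optimizer": "mini-batch SGD",
        "image_net": "VGG-F, first seven layers fine-tuned, new fch layer trained",
        "text_net": "multilayer perceptron, trained jointly with the fch layer",
        "learning_rate": 0.01,      # assumed: not stated in the excerpt
    }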