Dual Semantic Fusion Hashing for Multi-Label Cross-Modal Retrieval

Authors: Kaiming Liu, Yunhong Gong, Yu Cao, Zhenwen Ren, Dezhong Peng, Yuan Sun

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three benchmarks demonstrate the superior performance of our DSFH compared with 16 state-of-the-art methods.
Researcher Affiliation | Academia | Kaiming Liu1, Yunhong Gong1, Yu Cao1, Zhenwen Ren2, Dezhong Peng1,3 and Yuan Sun1; 1 College of Computer Science, Sichuan University; 2 School of National Defense Science and Technology, Southwest University of Science and Technology; 3 National Innovation Center for UHD Video Technology
Pseudocode | No | The paper describes the optimization steps in text and mathematical equations, but no formal pseudocode block or algorithm box is provided.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | To evaluate the effectiveness of the proposed DSFH, we conduct numerous experiments on three benchmarks. MIRFlickr-25K [Huiskes and Lew, 2008] has 25,000 image-text pairs sourced from the Flickr website... IAPRTC12 [Escalante et al., 2010] consists of 20,000 geographical images belonging to 255 categories... NUS-WIDE [Chua et al., 2009] contains 269,648 image-text pairs...
Dataset Splits | Yes | In our experiments, we select the instances associated with a minimum of 20 textual labels, resulting in 20,015 instances. Further, we randomly choose 2,000 instances as the query set, while the remaining image-text pairs constitute the training set.
Hardware Specification | No | To comprehensively evaluate the retrieval performance, we conduct extensive experiments on a Windows server equipped with 64GB of RAM.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were listed in the paper.
Experiment Setup | Yes | The number of anchors for RBF is set to 1500, and the maximum iteration step is set to 10. The hyper-parameters α and λ are set to {10 3, 10 3}, {10 4, 10 3}, and {10 4, 10 4} for MIRFlickr-25K, IAPRTC12, and NUS-WIDE, respectively. In addition, the number of clusters k is 400, 300, and 400 for three datasets, respectively.
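
Since the paper releases no code, the split and RBF anchor settings quoted above can only be illustrated, not reproduced. The following is a minimal Python sketch under stated assumptions: the function names `split_query_train` and `rbf_anchor_features` are hypothetical, the anchors are assumed to be sampled at random from the training set, and the kernel bandwidth is assumed to be the root mean squared sample-to-anchor distance. None of these choices is confirmed by the paper.

```python
import numpy as np


def split_query_train(n_instances, n_query, seed=0):
    """Randomly pick n_query indices as the query set; the rest form the training set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_instances)
    return perm[:n_query], perm[n_query:]


def rbf_anchor_features(X, anchors, sigma=None):
    """Map each row of X to RBF similarities against the anchor rows.

    The bandwidth heuristic below is an assumption, not a setting from the paper.
    """
    # Squared Euclidean distance between every sample and every anchor.
    sq = (
        (X ** 2).sum(axis=1)[:, None]
        - 2.0 * X @ anchors.T
        + (anchors ** 2).sum(axis=1)[None, :]
    )
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by round-off
    if sigma is None:
        sigma = max(np.sqrt(sq.mean()), 1e-12)
    return np.exp(-sq / (2.0 * sigma ** 2))


# Split sizes quoted in the table: 20,015 instances, 2,000 of them queries.
query_idx, train_idx = split_query_train(20015, 2000, seed=0)

# Toy features only; real experiments would use 1500 anchors on extracted features.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(50, 8))
anchors_toy = X_toy[rng.choice(50, size=5, replace=False)]  # assumed: random anchors
phi_toy = rbf_anchor_features(X_toy, anchors_toy)
```

With the quoted sizes, the training set holds the remaining 18,015 instances. The toy example uses 5 anchors purely to keep the sketch cheap to run; substituting 1500 anchors changes only the width of the resulting feature matrix.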