Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval
Authors: Jinpeng Wang, Bin Chen, Qiang Zhang, Zaiqiao Meng, Shangsong Liang, Shutao Xia
AAAI 2021, pp. 2755-2763
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that WSDHQ can achieve state-of-the-art performance on weakly-supervised compact coding. We conduct extensive experiments to evaluate our proposed WSDHQ model against several state-of-the-art shallow and deep hashing methods on two web image datasets. The MAP results of all methods are reported in Table 1, which shows that the proposed WSDHQ model substantially outperforms all the comparison methods. |
| Researcher Affiliation | Academia | (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) School of Computer Science and Engineering, Sun Yat-sen University; (3) University College London; (4) University of Cambridge |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the algorithms and optimization steps in paragraph form. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It does not include a repository link or an explicit statement of code release. |
| Open Datasets | Yes | MIR-FLICKR25K (Huiskes and Lew 2008) is a dataset of 25,000 Flickr images associated with 1,386 tags. NUS-WIDE (Chua et al. 2009) is a large-scale web image dataset also collected from Flickr, which contains 269,648 images with 5,018 tags provided by users. |
| Dataset Splits | Yes | 2,000 images are randomly sampled as test queries and the rest are used as the retrieval database and training images. We collect a subset of 193,752 images with the 21 most frequent labels for experiments. We follow (Cao et al. 2017; Liu et al. 2018) to randomly sample 5,000 images as queries and retain the rest as the database, from which we further sample 10,000 images and their tag sets as training data. (A minimal split sketch follows this table.) |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. It mentions using "a standard CNN f" and "AlexNet" as the backbone network, but no specific hardware details. |
| Software Dependencies | Yes | We implement WSDHQ based on TensorFlow (Abadi et al. 2016). We take Word2Vec (Mikolov et al. 2013) as the word embedding model and represent each tag with a 300-dimensional pre-trained embedding. We adopt mini-batch Adam with default parameters as the optimizer. (See the setup sketch below the table.) |
| Experiment Setup | Yes | For the semantic correlation graph, we set the maximum number of neighbors k = 20 for each tag, the correlation threshold τ to 0.75, and the merging threshold ϵ to 0.1. We set the number of negative tags selected in the adaptive cosine margin loss to Kn = 1000. We fine-tune all layers copied from the pre-trained model and train the transform layer via back-propagation from scratch. We adopt mini-batch Adam with default parameters as the optimizer. Besides, we select the learning rate from 10⁻⁵ to 10⁻², the hyper-parameter λ from 10⁻⁵ to 10⁻¹, and γ from [0.3, 0.5, 0.7, 1, 2, 3, 4] via cross-validation. Following (Cao et al. 2016, 2017; Liu et al. 2018; Eghbali and Tahvildari 2019), we adopt K = 256 codewords for each codebook, so the binary index for each image over all M codebooks requires B = M·log₂K = 8M bits (i.e., M bytes). (Graph-construction and code-length sketches follow the table.) |
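
The split protocol quoted in the Dataset Splits row is straightforward to reproduce. Below is a minimal sketch for the NUS-WIDE split (5,000 queries out of 193,752 images, then 10,000 training images drawn from the database); the function name and random seed are assumptions, since the paper specifies neither.

```python
import numpy as np

def split_nus_wide(num_images=193_752, num_queries=5_000,
                   num_train=10_000, seed=0):
    """Sample queries, keep the rest as the retrieval database,
    then draw the training set from the database. Counts follow
    the quoted evidence; the seed is an assumption."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_images)
    query_idx = perm[:num_queries]        # 5,000 test queries
    database_idx = perm[num_queries:]     # remaining 188,752 database images
    train_idx = rng.choice(database_idx, size=num_train, replace=False)
    return query_idx, database_idx, train_idx

query_idx, database_idx, train_idx = split_nus_wide()
```

The same pattern covers MIR-FLICKR25K with num_images=25_000 and num_queries=2_000.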
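The software dependencies likewise translate into a few lines of setup. The sketch below assumes gensim as the loader for the pre-trained 300-dimensional Word2Vec vectors (the paper only cites Mikolov et al. 2013, so the concrete vector file is an assumption) and uses tf.keras's Adam with its defaults, matching "mini-batch Adam with default parameters".

```python
import gensim.downloader as api  # assumed loader; the paper does not name one
import tensorflow as tf

# 300-dimensional pre-trained Word2Vec embeddings; the Google News
# vectors are an assumed concrete choice matching the cited model.
word2vec = api.load("word2vec-google-news-300")
tag_embedding = word2vec["sunset"]      # one tag -> a (300,) vector

# Mini-batch Adam "with default parameters" (lr=1e-3, beta_1=0.9,
# beta_2=0.999 in tf.keras).
optimizer = tf.keras.optimizers.Adam()
```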
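The semantic correlation graph settings (k = 20 neighbors per tag, correlation threshold τ = 0.75) can be illustrated as a k-NN graph over tag embeddings under cosine similarity. This is a sketch of the quoted hyperparameters only; the paper's exact construction, including the merging step governed by ϵ = 0.1, is not reproduced here.

```python
import numpy as np

def build_correlation_graph(tag_emb, k=20, tau=0.75):
    """Keep at most k neighbors per tag, and only edges whose cosine
    similarity exceeds tau, per the quoted settings."""
    normed = tag_emb / np.linalg.norm(tag_emb, axis=1, keepdims=True)
    sim = normed @ normed.T               # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)        # exclude self-edges
    edges = {}
    for i in range(sim.shape[0]):
        top_k = np.argsort(sim[i])[::-1][:k]
        edges[i] = [j for j in top_k if sim[i, j] > tau]
    return edges
```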
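Finally, the code-length accounting B = M·log₂K follows directly from indexing one of K codewords per codebook: with K = 256, each codebook costs 8 bits (one byte). A quick check:

```python
import math

def code_length_bits(M, K=256):
    """B = M * log2(K): bits to index one codeword in each of M codebooks."""
    return M * int(math.log2(K))

for M in (2, 4, 8):        # 16-, 32-, and 64-bit codes
    print(f"M={M}: {code_length_bits(M)} bits = {code_length_bits(M) // 8} bytes")
```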