Collective Deep Quantization for Efficient Cross-Modal Retrieval

Authors: Yue Cao, Mingsheng Long, Jianmin Wang, Shichen Liu

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that CDQ yields state-of-the-art cross-modal retrieval results on standard benchmarks.
Researcher Affiliation | Academia | Yue Cao, Mingsheng Long, Jianmin Wang, Shichen Liu; KLiss, MOE; TNList; School of Software, Tsinghua University, Beijing, China. Emails: {caoyue10,liushichen95}@gmail.com, {mingsheng,jimwang}@tsinghua.edu.cn
Pseudocode | No | The paper describes its algorithms in prose but includes no figure, block, or section labeled "Pseudocode" or "Algorithm".
Open Source Code | No | The paper provides no statement about, or link to, open-source code.
Open Datasets | Yes | NUS-WIDE (Chua et al. 2009) is a public web image dataset; MIRFlickr (Huiskes and Lew 2008) consists of 25,000 images collected from the Flickr website.
Dataset Splits | Yes | For NUS-WIDE, the authors randomly select 100 pairs per class as the query set, 500 pairs per class as the training set, and 50 pairs per class as the validation set. For MIRFlickr, they randomly select 1000 pairs as the query set, 4000 pairs as the training set, and 1000 pairs as the validation set (see the split sketch after this table).
Hardware Specification | No | The paper does not describe the hardware used to run its experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions TensorFlow but gives no version numbers for it or for any other software dependency.
Experiment Setup | Yes | The authors use mini-batch SGD with 0.9 momentum, fix the mini-batch size at 64, and cross-validate the learning rate. Following a strategy similar to (Long et al. 2016), they (1) set the bottleneck-layer dimension D = 128 so that the composite quantizer can quantize the bottleneck representations accurately; (2) use K = 256 codewords per codebook; and (3) encode each data point with B = M log₂ K = 8M bits (i.e., M bytes) across the M subspaces, setting M = B/8 for a given code length B (see the sizing sketch after this table).