Contrastive Quantization with Code Memory for Unsupervised Image Retrieval

Authors: Jinpeng Wang, Ziyun Zeng, Bin Chen, Tao Dai, Shu-Tao Xia (pp. 2468-2476)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark datasets show that MeCoQ outperforms state-of-the-art methods. Code and configurations are publicly released. ... Experiments Setup Datasets (i) Flickr25K (Huiskes and Lew 2008) contains 25k images from 24 categories. ... Metrics We adopt the typical metric, Mean Average Precision (MAP), from previous works... Table 1: Mean Average Precision (MAP, %) results for different number of bits on Flickr25K, CIFAR-10 (I and II) and NUS-WIDE datasets.
Researcher Affiliation | Academia | Jinpeng Wang1,2,4, Ziyun Zeng1,4, Bin Chen2*, Tao Dai3, Shu-Tao Xia1,4 1Tsinghua Shenzhen International Graduate School, Tsinghua University 2Harbin Institute of Technology, Shenzhen 3Shenzhen University 4Research Center of Artificial Intelligence, Peng Cheng Laboratory
Pseudocode | No | The paper describes the learning algorithm in prose under the "Learning Algorithm" section, but does not provide structured pseudocode or an algorithm block.
Open Source Code | Yes | Code and configurations are publicly released.
Open Datasets | Yes | Datasets (i) Flickr25K (Huiskes and Lew 2008) contains 25k images from 24 categories. ... (ii) CIFAR10 (Krizhevsky and Hinton 2009) contains 60k images from 10 categories. ... (iii) NUSWIDE (Chua et al. 2009) is a large-scale image dataset containing about 270k images from 81 categories.
Dataset Splits | No | (i) Flickr25K (Huiskes and Lew 2008) ... randomly pick 2,000 images as the testing queries, while another 5,000 images are randomly selected from the rest of the images as the training set. (ii) CIFAR10 (Krizhevsky and Hinton 2009) ... We follow Li and van Gemert (2021) to use 1k images per class (totally 10k images) as the test query set, and the remaining 50k images are used for training. (iii) NUSWIDE (Chua et al. 2009) ... 100 images per category are randomly selected as the testing queries while the remaining images form the database and the training set.
Hardware Specification | No | The paper mentions "a single GPU can afford a limited batch size" and discusses "GPU Mem." in Table 3, but does not provide specific hardware details such as GPU or CPU models.
Software Dependencies | No | We implement MeCoQ with PyTorch (Paszke et al. 2019). (No library versions are given.)
Experiment Setup | Yes | The default hyper-parameter settings are as follows. (i) We set the batch size as 128 and the maximum epoch as 50. (ii) The queue length (i.e., the memory bank size), NM = 384. (iii) The smoothness factor of codeword assignment in Eq.(3), α = 10. (iv) The codeword number of each codebook, K = 256, such that each image is encoded by B = M log2 K = 8M bits (i.e., M bytes). (v) The positive prior, ρ+ = 0.1 for CIFAR-10 (I and II), ρ+ = 0.15 for Flickr25K and NUS-WIDE. (vi) The starting epochs for the memory module are set to 5 on Flickr25K, 10 on NUS-WIDE, and 15 on CIFAR-10 (I and II).
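The review notes that Mean Average Precision (MAP) is the evaluation metric. For reference, a minimal pure-Python sketch of the standard MAP computation over ranked retrieval lists (not the paper's exact evaluation code) could look like this, where each query is represented by a 0/1 relevance list in rank order:

```python
def average_precision(relevance):
    """AP for one ranked list of 0/1 relevance labels: mean of
    precision@k over the ranks k where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / k
    return precision_sum / hits if hits else 0.0

def mean_average_precision(relevance_lists):
    """MAP: mean of per-query average precisions."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)

# Example: one query with relevant items at ranks 1 and 3
# AP = (1/1 + 2/3) / 2 = 5/6
print(mean_average_precision([[1, 0, 1]]))
```

Benchmarks often also report MAP@N (truncating each list to the top N results); the sketch above scores the full list.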
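The hyper-parameter defaults quoted in the last row can be collected into a single configuration sketch. The key names below are illustrative assumptions, not taken from the released MeCoQ code:

```python
import math

# Hypothetical config mirroring the reported defaults; key names are
# illustrative and do not come from the released MeCoQ repository.
DEFAULTS = {
    "batch_size": 128,
    "max_epoch": 50,
    "queue_length": 384,          # N_M, memory bank size
    "alpha": 10,                  # smoothness of codeword assignment, Eq.(3)
    "codewords_per_book": 256,    # K; each codeword index costs log2(K) = 8 bits
    "positive_prior": {           # rho+, per dataset
        "cifar10": 0.10,
        "flickr25k": 0.15,
        "nuswide": 0.15,
    },
    "memory_start_epoch": {       # epoch at which the memory module is enabled
        "flickr25k": 5,
        "nuswide": 10,
        "cifar10": 15,
    },
}

def num_bits(num_codebooks: int, codewords: int = 256) -> int:
    """Code length B = M * log2(K) bits for M codebooks of K codewords,
    so with K = 256 each image costs exactly M bytes."""
    return int(num_codebooks * math.log2(codewords))

# Example: M = 4 codebooks with K = 256 gives a 32-bit (4-byte) code.
print(num_bits(4))
```

This also makes the paper's bit-length accounting explicit: the 16/32/64-bit settings common in Table 1 correspond to M = 2, 4, and 8 codebooks at K = 256.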