Binary Generative Adversarial Networks for Image Retrieval
Authors: Jingkuan Song, Tao He, Lianli Gao, Xing Xu, Alan Hanjalic, Heng Tao Shen
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on standard datasets (CIFAR-10, NUSWIDE, and Flickr) demonstrate that our BGAN significantly outperforms existing hashing methods by up to 107% in terms of mAP (See Table 2). We evaluate our BGAN on the task of large-scale image retrieval. Specifically, the experiments are designed to study the following research questions of our algorithm: RQ1: How does each component of our algorithm affect the performance? RQ2: Do the binary codes computed directly without relaxation improve the performance over the relaxed solution? RQ3: Does the performance of BGAN significantly outperform the state-of-the-art hashing algorithms? RQ4: What is the efficiency of BGAN? |
| Researcher Affiliation | Academia | Jingkuan Song,1 Tao He,1 Lianli Gao,1 Xing Xu,1 Alan Hanjalic,2 Heng Tao Shen1 1Center for Future Media and School of Computer Science and Engineering, University of Electronic Science and Technology of China. 2Delft University of Technology, Netherlands. {jingkuan.song, hetaoconquer}@gmail.com, {lianli.gao,xing.xu}@uestc.edu.cn, a.hanjalic@tudelft.nl, shenhengtao@hotmail.com |
| Pseudocode | No | The paper describes the system architecture and training process, but it does not include a clearly labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | Our code: https://github.com/htconquer/BGAN |
| Open Datasets | Yes | We conduct empirical evaluation on three public benchmark datasets, CIFAR-10, NUS-WIDE, and Flickr. CIFAR-10 is a labeled subset of the 80 million tiny images dataset, consisting of 60,000 32x32 color images in 10 classes, with 6,000 images per class. NUS-WIDE is a web image dataset containing 269,648 images downloaded from Flickr. Tagging ground-truth for 81 semantic concepts is provided for evaluation. We follow the settings in (Zhu et al. 2016) and use the subset of 195,834 images from the 21 most frequent concepts, where each concept consists of at least 5,000 images. Flickr is a collection of about 25,000 images from Flickr, where each image is labeled with one of the 38 concepts. |
| Dataset Splits | No | In NUS-WIDE and CIFAR-10, we randomly select 100 images per class as the test query set, and 1,000 images per class as the training set. In Flickr, we randomly select 1,000 images as the test query set, and 4,000 images for training. The paper specifies training and test sets but does not explicitly mention a separate validation set. |
| Hardware Specification | No | The paper mentions training and testing times but does not provide specific hardware details such as CPU/GPU models or memory specifications. |
| Software Dependencies | No | The paper does not list specific versions for software dependencies, libraries, or frameworks used in the experiments (e.g., Python version, deep learning framework version). |
| Experiment Setup | Yes | By default, we set λ1 = 0.1 and λ2 = 0.1. We set the mini-batch size as 256, and the learning rate as 0.01. For the hashing layer, we start training BGAN with βt = 1. For each stage t, after BGAN converges, we increase βt and train (i.e., fine-tune) BGAN by setting the converged network parameters as the initialization for training the BGAN in the next stage. As βt → ∞, the network converges to BGAN with sgn(z) as the activation function, which can generate the desired binary codes. Using βt = 10 we can already achieve fast convergence for training BGAN. |
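The per-class protocol quoted under "Dataset Splits" (e.g., 100 test queries and 1,000 training images per class for CIFAR-10 and NUS-WIDE) can be sketched as below. This is an illustrative reconstruction, not the paper's released code; the function name and seeding are assumptions.

```python
import random
from collections import defaultdict

def per_class_split(labels, n_test=100, n_train=1000, seed=0):
    """Randomly pick n_test query and n_train training indices per class.

    Illustrative sketch of the split described in the paper; the
    released BGAN code may implement this differently.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    test_idx, train_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        test_idx.extend(idxs[:n_test])                  # query set
        train_idx.extend(idxs[n_test:n_test + n_train])  # training set
    return train_idx, test_idx

# Tiny usage example: 2 classes with 30 items each, 5 queries and
# 10 training images per class.
labels = [0] * 30 + [1] * 30
train, test = per_class_split(labels, n_test=5, n_train=10)
```

Note that, as in the paper's protocol, the test queries are sampled first and the training set is drawn from the remaining images, so the two sets are disjoint.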
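The staged βt schedule quoted under "Experiment Setup" is a continuation scheme: the hashing layer uses a smooth surrogate for sgn(z) that hardens as βt grows, and the network converged at one stage initializes the next. A minimal sketch, assuming tanh(βt·z) as the surrogate (the stage values and helper names here are illustrative, not taken from the released code):

```python
import math

def hash_activation(z, beta_t):
    """Smooth surrogate for sgn(z); approaches sgn as beta_t -> inf."""
    return math.tanh(beta_t * z)

# Continuation: train to convergence at each stage, then raise beta_t,
# reusing the converged parameters as the next stage's initialization.
for beta_t in (1, 2, 5, 10):  # schedule values are an assumption
    # ... fine-tune BGAN to convergence with hash_activation(., beta_t),
    # initialized from the previous stage's converged parameters ...
    pass

# Per the paper, beta_t = 10 already suffices: the surrogate is then
# nearly binary for moderate pre-activations z.
```

The design rationale is that directly optimizing through sgn(z) gives zero gradients almost everywhere, while the annealed surrogate stays differentiable at every stage yet ends close to the desired binary codes.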