Contrastive Transformer Masked Image Hashing for Degraded Image Retrieval
Authors: Xiaobo Shen, Haoyu Cai, Xiuwen Gong, Yuhui Zheng
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical studies conducted on three benchmark datasets demonstrate the superiority of the proposed CTMIH over the state-of-the-art in both degraded and normal image retrieval. |
| Researcher Affiliation | Academia | 1Nanjing University of Science and Technology 2University of Technology Sydney 3Qinghai Normal University |
| Pseudocode | Yes | Algorithm 1 Image Transformation T. Input: Image X, hyper-parameter δ; Output: Transformed Image. 1: Crop X to size 256×256 randomly; 2: Resize X to size 224×224; 3: Flip X horizontally with a probability of 0.5δ; 4: Add color jitter to X with a probability of 0.8δ; 5: Convert X to a grayscale image with a probability of 0.4δ; 6: Apply Gaussian blur to X with a probability of 0.5δ. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | MSCOCO [Lin et al., 2014] is a large-scale image dataset for object detection, segmentation, and captioning. NUS-WIDE [Chua et al., 2009] is a multi-label dataset. ImageNet [Russakovsky et al., 2014] is a single-label image dataset. |
| Dataset Splits | Yes | MSCOCO [Lin et al., 2014]: 5,000 images are randomly selected as the query set and the remaining images are used as the database; 10,000 images are randomly selected from the database for training. NUS-WIDE [Chua et al., 2009]: 100 images are randomly sampled from each category as the query set and the remaining images are used as the database; 500 images per category are randomly sampled from the database for training. ImageNet [Russakovsky et al., 2014]: 100 images from each category are randomly sampled for training, 5,000 images are sampled as the query set, and the remaining images are used as the database. |
| Hardware Specification | No | The paper states: 'The standard ViT-Base is used as the backbone', but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'Adam optimizer' but does not provide version numbers for any software libraries or frameworks (e.g., PyTorch), programming languages (e.g., Python), or other dependencies. |
| Experiment Setup | Yes | For the proposed method, we apply Algorithm 1 on each image in the training set to generate two augmented views, where δu and δv are set to 0.5 and 1 respectively. ...The masking ratio r is set to 0.3, class probability ϱ+ is set to 0.05, and temperature τ is set to 0.5. The two hyper-parameters α and β are set to 0.1 and 0.1 respectively. The batch size is set to 32, the number of epochs is set to 100, and the learning rates of the ViT and the hash layer are set to 10⁻⁵ and 10⁻³ respectively. The proposed method is trained using the Adam optimizer. |
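The probability structure of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the `image` argument and the operation names (`crop_256`, `hflip`, etc.) are placeholders standing in for real pixel transforms, and only the probability logic (each augmentation gated by a probability scaled by δ) follows the paper.

```python
import random

def transform(image, delta, rng=random):
    """Sketch of Algorithm 1 from the paper: each augmentation is applied
    with a probability scaled by the hyper-parameter delta. Here the
    "image" is ignored and we simply record which operations would run;
    in a real pipeline each name would be a pixel-level transform."""
    ops = []
    ops.append("crop_256")            # step 1: random 256x256 crop (always)
    ops.append("resize_224")          # step 2: resize to 224x224 (always)
    if rng.random() < 0.5 * delta:    # step 3: horizontal flip, p = 0.5*delta
        ops.append("hflip")
    if rng.random() < 0.8 * delta:    # step 4: color jitter, p = 0.8*delta
        ops.append("colorjitter")
    if rng.random() < 0.4 * delta:    # step 5: grayscale, p = 0.4*delta
        ops.append("grayscale")
    if rng.random() < 0.5 * delta:    # step 6: Gaussian blur, p = 0.5*delta
        ops.append("blur")
    return ops
```

With the paper's settings, the two views come from `transform(x, 0.5)` (δu, a weakly augmented view) and `transform(x, 1.0)` (δv, a strongly augmented view); δ = 0 would leave only the deterministic crop and resize.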
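The reported optimizer setup (Adam with learning rate 10⁻⁵ for the ViT backbone and 10⁻³ for the hash layer) maps naturally onto PyTorch parameter groups. The sketch below is a hedged config fragment: `CTMIHModel`, its `backbone`, and `hash_layer` are hypothetical stand-ins, since the paper releases no code.

```python
import torch
import torch.nn as nn

class CTMIHModel(nn.Module):
    """Hypothetical stand-in for the paper's architecture: a ViT-Base
    backbone followed by a hash layer producing binary-like codes."""
    def __init__(self, hash_bits=64):
        super().__init__()
        self.backbone = nn.Linear(768, 768)        # placeholder for ViT-Base
        self.hash_layer = nn.Linear(768, hash_bits)

model = CTMIHModel()
# Two parameter groups with the learning rates reported in the paper:
# 1e-5 for the pretrained backbone, 1e-3 for the randomly initialized hash layer.
optimizer = torch.optim.Adam([
    {"params": model.backbone.parameters(), "lr": 1e-5},
    {"params": model.hash_layer.parameters(), "lr": 1e-3},
])
```

Giving the freshly initialized hash layer a larger learning rate than the pretrained backbone is a common fine-tuning pattern, consistent with the 10⁻⁵/10⁻³ split the paper reports.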