Zero-Shot Sketch-Based Image Retrieval via Modality Capacity Guidance

Authors: Yanghong Zhou, Dawei Liu, P. Y. Mok

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiment results have demonstrated our significant performance improvements, achieving an increase of 7.3%/3.2% and 19.9%/10.3% in terms of mAP@200/P@200 compared to the state-of-the-art models on CLIP and DINO, respectively, on the Sketchy-Ext dataset (split 2).
Researcher Affiliation | Academia | 1 School of Fashion and Textiles, The Hong Kong Polytechnic University; 2 Research Institute for Intelligent Wearable Systems, The Hong Kong Polytechnic University; 3 Research Centre of Textiles for Future Fashion, The Hong Kong Polytechnic University
Pseudocode | No | The paper provides mathematical equations for loss functions but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Data, code, and supplementary information are available at https://github.com/YHdian0716/ZS-SBIR-MCC.git
Open Datasets | Yes | We evaluated the effectiveness of our proposed modality capacity constraint loss on three widely-used benchmarks: Sketchy-Ext [Liu et al., 2017], TUBerlin-Ext [Eitz et al., 2012] and a subset of QuickDraw-Ext [Dey et al., 2019a].
Dataset Splits | Yes | For data partitioning, we also followed [Dey et al., 2019b] to divide Sketchy-Ext [Liu et al., 2017] into 100/104 categories for training and 25/21 categories for testing, denoted as Sketchy-Ext Split 1 and Sketchy-Ext Split 2, respectively. We utilized 25 categories from Sketchy-Ext [Liu et al., 2017] and 30 categories from TUBerlin-Ext [Eitz et al., 2012] for testing, and the remaining 100/220 categories for training.
Hardware Specification | Yes | All the experiments were conducted with PyTorch on an 11 GB Nvidia RTX 3080-Ti GPU.
Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other software dependencies with their versions.
Experiment Setup | Yes | We used the Adam optimizer to train the models with a learning rate of lr = 1e-4, β1 = 0.9 and β2 = 0.999. The input size of the images was 224 × 224. The models were trained for 60 epochs with a batch size of 64. During the training stage, all the parameters of the models were frozen except for the layer normalization. The loss weights were set as λ1 = λ4 = 1, λ2 = λ5 = 4 and λ3 = λ6 = 8. The margin µ was set as 0.3.
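The Experiment Setup row describes an unusual fine-tuning recipe: everything frozen except layer normalization, trained with Adam at lr = 1e-4 and betas (0.9, 0.999). The snippet below is a minimal PyTorch sketch of that configuration; the `backbone` module is a hypothetical stand-in, not the paper's CLIP/DINO encoder.

```python
import torch
import torch.nn as nn

def freeze_except_layernorm(model: nn.Module) -> list:
    """Freeze every parameter, then re-enable only LayerNorm parameters."""
    for p in model.parameters():
        p.requires_grad = False
    trainable = []
    for m in model.modules():
        if isinstance(m, nn.LayerNorm):
            for p in m.parameters():
                p.requires_grad = True
                trainable.append(p)
    return trainable

# Hypothetical stand-in backbone with LayerNorm layers (the paper uses CLIP/DINO).
backbone = nn.Sequential(
    nn.Linear(768, 768),
    nn.LayerNorm(768),
    nn.Linear(768, 512),
    nn.LayerNorm(512),
)

# Optimize only the LayerNorm parameters, with the reported Adam settings.
trainable_params = freeze_except_layernorm(backbone)
optimizer = torch.optim.Adam(trainable_params, lr=1e-4, betas=(0.9, 0.999))

# Other reported hyperparameters, kept here for reference.
loss_weights = {"l1": 1, "l2": 4, "l3": 8, "l4": 1, "l5": 4, "l6": 8}
margin = 0.3
```

In a real run, the training loop would combine the six weighted losses and iterate for 60 epochs with batch size 64 on 224 × 224 inputs, as reported.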
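The mAP@200 and P@200 figures quoted in the Research Type row are standard ranked-retrieval metrics. The sketch below gives generic definitions of precision@k and average precision@k (mAP@k is the latter averaged over all queries); it is an illustration of the metrics, not the authors' evaluation code.

```python
import numpy as np

def precision_at_k(relevant: np.ndarray, k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant.
    `relevant` is a 0/1 array ordered by retrieval rank."""
    return float(relevant[:k].sum()) / k

def average_precision_at_k(relevant: np.ndarray, k: int) -> float:
    """Mean of precision@i over the ranks i <= k where a relevant item appears."""
    rel = relevant[:k]
    hits = rel.nonzero()[0]
    if len(hits) == 0:
        return 0.0
    precisions = [rel[: i + 1].sum() / (i + 1) for i in hits]
    return float(np.mean(precisions))

# Toy example: ranks 1, 3 and 4 are relevant among the top 5 retrieved items.
ranked_relevance = np.array([1, 0, 1, 1, 0])
p_at_5 = precision_at_k(ranked_relevance, 5)           # 3/5 = 0.6
ap_at_5 = average_precision_at_k(ranked_relevance, 5)  # (1 + 2/3 + 3/4)/3 ≈ 0.806
```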