Scene-Level Sketch-Based Image Retrieval with Minimal Pairwise Supervision

Authors: Ce Ge, Jingyu Wang, Qi Qi, Haifeng Sun, Tong Xu, Jianxin Liao

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments confirm the ability of our model to robustly retrieve multiple related objects at the scene level and exhibit superior performance beyond strong competitors. Extensive experiments verified the superiority of our design and set a new state-of-the-art benchmark. |
| Researcher Affiliation | Collaboration | State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China. {nwlgc, wangjingyu, qiqi8266, hfsun}@bupt.edu.cn, xutong@ebupt.com, jxlbupt@gmail.com |
| Pseudocode | No | The paper describes the model architecture and components but does not include any pseudocode or explicitly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Sketchy Scene (Zou et al. 2018) is the first and only usable large-scale scene sketch dataset to conduct scene-level SBIR. |
| Dataset Splits | Yes | After cleaning corrupted and duplicated data, we ended up with 5,616 sketch-photo pairs for training, 530 for validation, and 1,113 for testing. |
| Hardware Specification | Yes | All the experiments were conducted on NVIDIA Tesla P100 GPU with 16GB memory. |
| Software Dependencies | No | The paper mentions using ResNet-50, Mask R-CNN, and the Adam optimizer, but does not provide version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries. |
| Experiment Setup | Yes | During training, input sketches and images are resized to 448 × 448. The dimensions of node features and graph embedding are set to 512. Models are all trained using the Adam optimizer (Kingma and Ba 2015) for up to 800 epochs. The early stopping strategy is employed to combat overfitting. The batch size and learning rate are set to 12 and 1e-4, respectively. |
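The reported training schedule (Adam, lr 1e-4, batch size 12, up to 800 epochs, early stopping) can be sketched as follows. This is a minimal illustration, not the authors' code: the `patience` value and the synthetic validation-loss curve are assumptions, since the paper does not specify its early-stopping criterion.

```python
# Reported hyperparameters from the paper's experiment setup.
MAX_EPOCHS = 800
BATCH_SIZE = 12
LEARNING_RATE = 1e-4
EMBED_DIM = 512          # node-feature / graph-embedding dimension
INPUT_SIZE = (448, 448)  # sketches and images resized before training

def train_with_early_stopping(val_losses, max_epochs=MAX_EPOCHS, patience=10):
    """Stop once validation loss fails to improve for `patience` epochs.

    `patience` is an assumed value; the paper only states that early
    stopping is used to combat overfitting.
    """
    best, stale, stopped_at = float("inf"), 0, None
    for epoch, loss in zip(range(max_epochs), val_losses):
        if loss < best:
            best, stale = loss, 0   # improvement: reset the counter
        else:
            stale += 1              # no improvement this epoch
        if stale >= patience:
            stopped_at = epoch      # trigger early stopping
            break
    return best, stopped_at

# Synthetic loss curve: improves for 50 epochs, then plateaus at 0.05.
losses = [1.0 / (e + 1) for e in range(50)] + [0.05] * 200
best, stopped = train_with_early_stopping(losses)
# Training halts long before the 800-epoch cap once the plateau is detected.
```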