Scene-Level Sketch-Based Image Retrieval with Minimal Pairwise Supervision
Authors: Ce Ge, Jingyu Wang, Qi Qi, Haifeng Sun, Tong Xu, Jianxin Liao
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments confirm the ability of our model to robustly retrieve multiple related objects at the scene level and exhibit superior performance beyond strong competitors. Extensive experiments verified the superiority of our design and set a new state-of-the-art benchmark. |
| Researcher Affiliation | Collaboration | State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; {nwlgc, wangjingyu, qiqi8266, hfsun}@bupt.edu.cn, xutong@ebupt.com, jxlbupt@gmail.com |
| Pseudocode | No | The paper describes the model architecture and components but does not include any pseudocode or explicitly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Sketchy Scene (Zou et al. 2018) is the first and only usable large-scale scene sketch dataset to conduct scene-level SBIR. |
| Dataset Splits | Yes | After cleaning corrupted and duplicated data, we ended up with 5,616 sketch-photo pairs for training, 530 for validation, and 1,113 for testing. |
| Hardware Specification | Yes | All the experiments were conducted on NVIDIA Tesla P100 GPU with 16GB memory. |
| Software Dependencies | No | The paper mentions using ResNet-50, Mask R-CNN, and Adam optimizer, but does not provide specific version numbers for software dependencies like deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries. |
| Experiment Setup | Yes | During training, input sketches and images are resized to 448 × 448. The dimensions of the node features and graph embedding are set to 512. All models are trained using the Adam optimizer (Kingma and Ba 2015) for up to 800 epochs. An early-stopping strategy is employed to combat overfitting. The batch size and learning rate are set to 12 and 1e-4, respectively. A minimal configuration sketch based on these settings follows the table. |
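
The paper does not name a deep-learning framework or release code, so the following is a minimal PyTorch-style sketch of the reported training setup (448 × 448 inputs, 512-d embeddings, Adam at 1e-4, batch size 12, up to 800 epochs with early stopping). The model, data loaders, loss, and the early-stopping patience value are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch of the reported training configuration; model/dataset
# classes and the patience value are assumptions, not the authors' code.
import torch
from torch import nn, optim
from torchvision import transforms

# Hyperparameters reported in the Experiment Setup row.
IMAGE_SIZE = 448        # sketches and photos resized to 448 x 448
EMBED_DIM = 512         # node-feature / graph-embedding dimension
BATCH_SIZE = 12
LEARNING_RATE = 1e-4
MAX_EPOCHS = 800
PATIENCE = 20           # early-stopping patience (not reported; assumed)

# Input preprocessing applied to both sketches and photos.
preprocess = transforms.Compose([
    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
    transforms.ToTensor(),
])

def train(model: nn.Module, train_loader, val_loader, criterion):
    """Adam optimization with early stopping on validation loss."""
    optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)
    best_val, epochs_without_improvement = float("inf"), 0

    for epoch in range(MAX_EPOCHS):
        model.train()
        for sketches, photos in train_loader:   # paired scene sketches/photos
            optimizer.zero_grad()
            loss = criterion(model(sketches), model(photos))
            loss.backward()
            optimizer.step()

        # Validation pass used for the early-stopping criterion.
        model.eval()
        with torch.no_grad():
            val_loss = sum(
                criterion(model(s), model(p)).item() for s, p in val_loader
            ) / max(len(val_loader), 1)

        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= PATIENCE:
                break                            # stop early to combat overfitting
```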