Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing
Authors: Guangming Zhu, Siyuan Wang, Tianci Wu, Liang Zhang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the SPG and SketchIME datasets demonstrate the memory module's flexibility and the recognition network's explainability. |
| Researcher Affiliation | Academia | Guangming Zhu (1,2,3), Siyuan Wang (1), Tianci Wu (1), Liang Zhang (1,2,3); (1) School of Computer Science and Technology, Xidian University, China; (2) Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province; (3) Xi'an Key Laboratory of Intelligent Software Engineering |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data are available at https://github.com/GuangmingZhu/SketchESC. |
| Open Datasets | Yes | The SPG dataset (Li et al. 2018) and SketchIME dataset (Zhu et al. 2023) are used to verify the advantages of the proposed network. |
| Dataset Splits | No | The paper specifies training and test data but does not explicitly mention a validation split or its size for model development or hyperparameter tuning. For SPG: 'An average of 600 samples per category are used for training, while 100 samples for testing.' For SketchIME: 'An average of 100 samples per sketch category are used for training, while 50 samples for testing.' A hypothetical per-category split is sketched after this table. |
| Hardware Specification | Yes | Our network is implemented by Pytorch and trained on a single NVIDIA GTX 3090. |
| Software Dependencies | No | The paper mentions 'implemented by Pytorch' and 'Transformer module initialized with the pretrained ViT-Base model from Hugging Face', but it does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The learning rate is initialized to 3 × 10⁻⁴ with a batch size of 128. The Adam optimizer is used. A total of 200 epochs are run for each training. The τ in Eq. (1) is set to 1. The λ_1 and λ_2 in Eq. (8) are set to 1 and 20, respectively. The λ_s in Eq. (6) and the λ_c in Eq. (7) are set to 10 empirically. |
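
Below is a minimal, hypothetical sketch of the per-category train/test protocol quoted in the Dataset Splits row (roughly 600 training and 100 test samples per SPG category; 100 and 50 for SketchIME). The function name, the (category, sketch) sample representation, and the random seed are illustrative assumptions, not the authors' released code.

```python
import random
from collections import defaultdict

def split_per_category(samples, n_train, n_test, seed=0):
    """samples: iterable of (category, sketch) pairs; returns (train, test) lists."""
    by_category = defaultdict(list)
    for category, sketch in samples:
        by_category[category].append((category, sketch))

    rng = random.Random(seed)
    train, test = [], []
    for items in by_category.values():
        rng.shuffle(items)
        train.extend(items[:n_train])                  # e.g. 600 per SPG category
        test.extend(items[n_train:n_train + n_test])   # e.g. 100 per SPG category
    return train, test

# Hypothetical usage for a SketchIME-style split (100 train / 50 test per category):
# train_set, test_set = split_per_category(sketchime_samples, n_train=100, n_test=50)
```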
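The Software Dependencies and Experiment Setup rows together describe a standard PyTorch training configuration: a pretrained ViT-Base backbone from Hugging Face, Adam with a learning rate of 3 × 10⁻⁴, a batch size of 128, and 200 epochs. The sketch below wires those reported values together; the checkpoint id, the placeholder classification head, and the class count are assumptions for illustration only, not the paper's released implementation.

```python
import torch
from transformers import ViTModel

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed Hugging Face checkpoint; the paper only states "pretrained ViT-Base model".
backbone = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k").to(device)

# Placeholder head standing in for the paper's component-level recognition network;
# the class count (25) is an arbitrary illustrative value.
head = torch.nn.Linear(backbone.config.hidden_size, 25).to(device)

# Hyperparameters reported in the Experiment Setup row.
optimizer = torch.optim.Adam(
    list(backbone.parameters()) + list(head.parameters()),
    lr=3e-4,
)
NUM_EPOCHS = 200
BATCH_SIZE = 128
```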