Doodle to Object: Practical Zero-Shot Sketch-Based 3D Shape Retrieval

Authors: Bingrui Wang, Yuan Zhou

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two common SBSR benchmarks and our D2O dataset demonstrate the efficacy of the proposed PCL method for ZS-SBSR.
Researcher Affiliation | Academia | Bingrui Wang, Yuan Zhou*; School of Electrical and Information Engineering, Tianjin University, Tianjin, China; {wangbingrui, zhouyuan}@tju.edu.cn
Pseudocode | Yes | Algorithm 1: Pseudo-code for PCL-Based Feature Extraction; Algorithm 2: Steps of Cross-Domain Alignment
Open Source Code | Yes | Resource is available at https://github.com/yigohw/doodle2object.
Open Datasets | Yes | Thus, we start with the Doodle2Object (D2O) dataset, which can be augmented to meet the challenges of ZS-SBSR with the help of ModelNet40 (Wu et al. 2015) and QuickDraw (Ha and Eck 2017). D2O consists of 8,992 3D shapes and more than 7M sketches spanning 50 categories. The dataset guarantees a sample size of at least 30 items per 3D shape class and contains the temporal information of sketches. We believe that the proposed D2O dataset has the potential to mimic the real-world semantic gap between sketches and the larger domain of 3D shapes. Resource is available at https://github.com/yigohw/doodle2object.
Dataset Splits | Yes | For SHREC'13, 23 of the 90 classes had less than five 3D shapes, and we classified them as unseen for testing; the other 67 were classified for training. For SHREC'14, we classified 38 classes with a 3D-shape sample number of five or less as unseen for testing; the other 133 were classified for training. For D2O, we randomly selected 80% of the classes as seen for training and the remaining 20% as unseen. To test the performance of the normal SBSR task, we split the seen samples into training and testing sets. For SHREC'13/14, the sketches used the original 50/30 division, whereas the 3D shapes were divided randomly according to 80/20%. For D2O, we randomly chose 80% for training and 20% for testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU model, CPU type, memory) used for running the experiments. It only mentions 'We implemented our PCL using PyTorch.'
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | We implemented our PCL using PyTorch. A ResNet-50 (He et al. 2016) pre-trained on ImageNet was used as the 3D shape feature extractor backbone, whereas the sketch feature extractor applied Sketch-a-Net (Yu et al. 2015). The size of two alignment maps was 2048-1024-256. Memory size k per category was set to 10. The max epoch number was 50. Adam was adopted as optimizer with a learning rate initially set to 1e-4 with exponentially decay at rate 0.95. For testing, each positive sample tier was paired with nine randomly selected negative tiers.
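
The Dataset Splits and Experiment Setup rows describe two procedures concrete enough to sketch: the two-level D2O split (80% of classes seen, then 80% of seen samples for training) and the exponentially decayed learning rate (1e-4 initial, rate 0.95). The Python below is a minimal illustration, not the authors' code. The function names, the `samples_by_class` dictionary layout, the fixed seed, the per-class application of the 80/20 sample split, and the choice to apply the decay once per epoch are all assumptions not stated in the source.

```python
import random


def split_d2o(samples_by_class, seen_frac=0.8, train_frac=0.8, seed=0):
    """Zero-shot style split: classes first, then samples of the seen classes.

    samples_by_class: dict mapping a class name to a list of sample ids
    (a hypothetical layout chosen for this sketch).
    Returns (train, test_seen, test_unseen), each a dict keyed by class.
    """
    rng = random.Random(seed)

    # Level 1: randomly mark a fraction of the classes as seen.
    classes = sorted(samples_by_class)
    rng.shuffle(classes)
    n_seen = round(seen_frac * len(classes))
    seen, unseen = classes[:n_seen], classes[n_seen:]

    # Level 2: within each seen class, split samples into train and test
    # (per-class splitting is an assumption of this sketch).
    train, test_seen = {}, {}
    for name in seen:
        items = list(samples_by_class[name])
        rng.shuffle(items)
        cut = round(train_frac * len(items))
        train[name], test_seen[name] = items[:cut], items[cut:]

    # Unseen classes are held out entirely for zero-shot testing.
    test_unseen = {name: list(samples_by_class[name]) for name in unseen}
    return train, test_seen, test_unseen


def lr_at_epoch(epoch, base_lr=1e-4, gamma=0.95):
    """Learning rate after `epoch` decay steps, assuming one step per epoch."""
    return base_lr * gamma ** epoch
```

With the 50 D2O categories, an 80/20 class split gives 40 seen and 10 unseen classes. Under the per-epoch assumption, `lr_at_epoch(e)` for `e` in `range(50)` lists the rate used in each of the 50 epochs; in PyTorch the same schedule would typically be obtained with `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)` stepped once per epoch.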