Doodle to Object: Practical Zero-Shot Sketch-Based 3D Shape Retrieval
Authors: Bingrui Wang, Yuan Zhou
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on two common SBSR benchmarks and our D2O dataset demonstrate the efficacy of the proposed PCL method for ZS-SBSR. |
| Researcher Affiliation | Academia | Bingrui Wang, Yuan Zhou* School of Electrical and Information Engineering, Tianjin University, Tianjin, China {wangbingrui, zhouyuan}@tju.edu.cn |
| Pseudocode | Yes | Algorithm 1: Pseudo-code for PCL-Based Feature Extraction; Algorithm 2: Steps of Cross-Domain Alignment |
| Open Source Code | Yes | Resource is available at https://github.com/yigohw/doodle2object. |
| Open Datasets | Yes | Thus, we start with the Doodle2Object (D2O) dataset, which can be augmented to meet the challenges of ZS-SBSR with the help of ModelNet40 (Wu et al. 2015) and QuickDraw (Ha and Eck 2017). D2O consists of 8,992 3D shapes and more than 7M sketches spanning 50 categories. The dataset guarantees a sample size of at least 30 items per 3D shape class and contains the temporal information of sketches. We believe that the proposed D2O dataset has the potential to mimic the real-world semantic gap between sketches and the larger domain of 3D shapes. Resource is available at https://github.com/yigohw/doodle2object. |
| Dataset Splits | Yes | For SHREC'13, 23 of the 90 classes had less than five 3D shapes, and we classified them as unseen for testing; the other 67 were classified for training. For SHREC'14, we classified 38 classes with a 3D-shape sample number of five or less as unseen for testing; the other 133 were classified for training. For D2O, we randomly selected 80% of the classes as seen for training and the remaining 20% as unseen. To test the performance of the normal SBSR task, we split the seen samples into training and testing sets. For SHREC'13/14, the sketches used the original 50/30 division, whereas the 3D shapes were divided randomly according to 80/20%. For D2O, we randomly chose 80% for training and 20% for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU model, CPU type, memory) used for running the experiments. It only mentions 'We implemented our PCL using PyTorch.' |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We implemented our PCL using PyTorch. A ResNet-50 (He et al. 2016) pre-trained on ImageNet was used as the 3D shape feature extractor backbone, whereas the sketch feature extractor applied Sketch-a-Net (Yu et al. 2015). The size of two alignment maps was 2048-1024-256. Memory size k per category was set to 10. The max epoch number was 50. Adam was adopted as optimizer with a learning rate initially set to 1e-4 with exponentially decay at rate 0.95. For testing, each positive sample tier was paired with nine randomly selected negative tiers. |
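
The class-level splits quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `seed` argument, and the placeholder class names are assumptions made for illustration. It shows the two rules the paper describes, a count-based rule (SHREC'13 uses "less than five" shapes, SHREC'14 uses "five or less") and a random 80/20 class split (D2O).

```python
import random


def split_classes_by_count(shape_counts, threshold, inclusive=False):
    """Classes with few 3D shapes become unseen (test); the rest are seen (train).

    inclusive=False marks classes with count < threshold as unseen;
    inclusive=True marks classes with count <= threshold as unseen.
    """
    def is_unseen(count):
        return count <= threshold if inclusive else count < threshold

    unseen = sorted(c for c, n in shape_counts.items() if is_unseen(n))
    seen = sorted(c for c, n in shape_counts.items() if not is_unseen(n))
    return seen, unseen


def split_classes_randomly(class_names, seen_fraction=0.8, seed=0):
    """Randomly assign a fraction of classes to seen; the remainder is unseen."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = sorted(class_names)
    rng.shuffle(shuffled)
    n_seen = round(len(shuffled) * seen_fraction)
    return sorted(shuffled[:n_seen]), sorted(shuffled[n_seen:])


if __name__ == "__main__":
    # Hypothetical placeholder names standing in for the 50 D2O categories.
    d2o_classes = [f"class_{i}" for i in range(50)]
    seen, unseen = split_classes_randomly(d2o_classes)
    print(len(seen), len(unseen))
```

With 50 categories and an 80% seen fraction, this yields 40 seen and 10 unseen classes. Splitting at the class level, rather than the sample level, is what makes the test classes unseen in the zero-shot sense.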
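
The learning-rate schedule quoted in the Experiment Setup row can be worked out numerically. The sketch below assumes the 0.95 decay is applied once per epoch, which the quoted text does not state; the function name is hypothetical. In PyTorch, a comparable setup would pair `torch.optim.Adam` with `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)`, though the paper does not confirm which scheduler was used.

```python
def exponential_lr(initial_lr=1e-4, decay_rate=0.95, num_epochs=50):
    """Return the learning rate used in each epoch.

    Assumes one decay step per epoch: lr_e = initial_lr * decay_rate ** e.
    """
    return [initial_lr * decay_rate ** epoch for epoch in range(num_epochs)]


if __name__ == "__main__":
    schedule = exponential_lr()
    # Print the first, second, and last learning rates of the 50-epoch run.
    print(schedule[0], schedule[1], schedule[-1])
```

Under this assumption the rate falls from 1e-4 in the first epoch to about 8.1e-6 in the last of the 50 epochs.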