SketchEdit: Editing Freehand Sketches at the Stroke-Level
Authors: Tengjie Li, Shikui Tu, Lei Xu
IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that SketchEdit is effective for stroke-level sketch editing and sketch reconstruction. (Section 4: Experiment) |
| Researcher Affiliation | Academia | Tengjie Li¹, Shikui Tu¹, Lei Xu¹,² — ¹Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; ²Guangdong Institute of Intelligence Science and Technology, Zhuhai, Guangdong 519031, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is publicly available at https://github.com/CMACH508/SketchEdit/. |
| Open Datasets | Yes | Two datasets are selected from the largest sketch dataset QuickDraw [Ha and Eck, 2017] for experiments. DS1 is a 17-category dataset [Su et al., 2020; Qi et al., 2022]. ... DS2 [Zang et al., 2021] is a multi-style and comparatively small dataset for synthesized sketches, comprising five categories: bee, bus, flower, giraffe, and pig. |
| Dataset Splits | No | Each category contains 70000 sketches for training and 2500 sketches for testing. The paper specifies training and testing splits but does not explicitly mention a validation split or its size. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running experiments. |
| Software Dependencies | No | The paper mentions optimizers and schedulers (AdamW, Cosine Annealing LR scheduler) and general deep learning concepts but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA version). |
| Experiment Setup | Yes | The AdamW optimizer [Loshchilov and Hutter, 2017] is applied to train the proposed model with parameters β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸, and weight decay = 0.01. We use the Cosine Annealing LR scheduler [Smith and Topin, 2019] with peak learning rates of 0.002 and 0.0005 for the pre-trained model and the diffusion model, respectively. We set the drop path rate to 0.1. All sketches are padded to the same length, i.e., Lp = 180. Each sketch is broken down into Ls = 25 strokes and each stroke contains 96 points. For the pre-trained network, we train it for 15 epochs with a batch size of 200. There are 8 gMLP blocks in the stroke encoder with dmodel1 = 96 and dffn1 = 384. The token mixture block and the sequence decoder include 2 and 12 gMLP blocks, respectively. We set dmodel2 = 128 and dffn2 = 512 for these blocks. We train the U-Net of the diffusion model for 40 epochs with a batch size of 768. The encoder and the decoder both consist of 12 gMLP blocks. The dmodel and dffn in these blocks are 96 and 384, respectively. We consider the linear noise schedule for the model with βt ∈ (0.0001, 0.02). We take 60 steps for DDIM sampling by default and truncate the stroke locations at (−1, 1) for better performance. (A hedged configuration sketch of this setup follows the table.) |
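
The experiment-setup row above quotes the paper's optimizer, scheduler, and diffusion hyperparameters. The following is a minimal, hedged sketch of that configuration, assuming PyTorch; only the hyperparameters quoted above are taken from the paper, while the helper names (`build_optimizer`, `linear_beta_schedule`), the stand-in `nn.Linear` module, the choice of `T_max`, and the 1000-step diffusion length are assumptions for illustration, not the authors' released code.

```python
# Hedged sketch of the reported training configuration, assuming PyTorch.
# Quoted hyperparameters: AdamW (beta1=0.9, beta2=0.999, eps=1e-8, wd=0.01),
# cosine-annealing LR, peak LRs 0.002 / 0.0005, linear betas in (0.0001, 0.02).
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_optimizer(model: torch.nn.Module, peak_lr: float, total_epochs: int):
    """AdamW paired with a cosine-annealing schedule decaying from peak_lr."""
    optimizer = AdamW(model.parameters(), lr=peak_lr,
                      betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01)
    scheduler = CosineAnnealingLR(optimizer, T_max=total_epochs)  # T_max choice is assumed
    return optimizer, scheduler


def linear_beta_schedule(num_steps: int = 1000,  # 1000 total steps is an assumption
                         beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> torch.Tensor:
    """Linear noise schedule with beta_t ranging over (0.0001, 0.02)."""
    return torch.linspace(beta_start, beta_end, num_steps)


if __name__ == "__main__":
    # Stand-in module; the paper's networks (gMLP stroke encoder, diffusion
    # U-Net) are not reproduced here.
    dummy = torch.nn.Linear(96, 96)
    # Pre-trained network: peak LR 0.002, 15 epochs, batch size 200.
    opt_pre, sched_pre = build_optimizer(dummy, peak_lr=0.002, total_epochs=15)
    # Diffusion U-Net: peak LR 0.0005, 40 epochs, batch size 768.
    opt_diff, sched_diff = build_optimizer(dummy, peak_lr=0.0005, total_epochs=40)
    betas = linear_beta_schedule()
    print(betas[0].item(), betas[-1].item())
```

The sampling-time settings quoted above (60 DDIM steps by default, stroke locations truncated at (−1, 1)) are not shown in this sketch.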