Scale-Adaptive Diffusion Model for Complex Sketch Synthesis

Authors: Jijin Hu, Ke Li, Yonggang Qi, Yi-Zhe Song

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the QuickDraw dataset showcase the potential of diffusion models to push the boundaries of sketch generation, particularly in complex scenarios unattainable by vector-based methods.
Researcher Affiliation | Academia | Jijin Hu¹, Ke Li¹, Yonggang Qi¹; ¹Beijing University of Posts and Telecommunications, CN, {jijinhu,like1990,qiyg}@bupt.edu.cn. Yi-Zhe Song²; ²SketchX, CVSSP, University of Surrey, UK, y.song@surrey.ac.uk
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Correspondence to: Ke Li (like1990@bupt.edu.cn). Code to be found at GitHub page
Open Datasets | Yes | The current largest doodle dataset QuickDraw, which has 345 common object categories, is adopted for model training and evaluation.
Dataset Splits | Yes | The default parameters are α = 1.0, β = 0.2, and γ = 0.02 in equation 2, which are determined on a validation set through greedy search.
Hardware Specification | Yes | Four Nvidia 3090 GPUs are used and the learning rate is set to 1e-4.
Software Dependencies | No | The paper mentions software components like U-Net, DDIM sampler, and CLIP model but does not provide specific version numbers for these or other software dependencies like programming languages or deep learning frameworks.
Experiment Setup | Yes | The same U-Net proposed in ADM (Dhariwal & Nichol, 2021) is employed as the noise predictor, and 10k sketches per category (batch size is 64) in the training set are used to train our model for 200k iterations. The default size of the produced sketches is set to 64 × 64. Four Nvidia 3090 GPUs are used and the learning rate is set to 1e-4. An EMA rate of 0.9999 is adopted to stabilize the training. The default parameters are α = 1.0, β = 0.2, and γ = 0.02 in equation 2, which are determined on a validation set through greedy search. (See Appendix B for more details.) And we set η = 0.2 in equation 5, ξ = 0.5 in equation 6 empirically. During generation, the DDIM sampler is adopted and the default total steps are set to 250.
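
The values quoted in the Experiment Setup row can be collected into a small configuration sketch. This is not the authors' code: the class name, field names, and both helper functions are hypothetical. The EMA helper uses the standard exponential-moving-average formula (the paper only states the rate), and the timestep helper assumes an evenly strided DDIM schedule over 1000 training timesteps, a number the summary above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the summary; field names are illustrative."""
    image_size: int = 64                 # sketches are 64 x 64
    sketches_per_category: int = 10_000
    batch_size: int = 64
    train_iterations: int = 200_000
    learning_rate: float = 1e-4
    ema_rate: float = 0.9999
    alpha: float = 1.0                   # equation 2
    beta: float = 0.2                    # equation 2
    gamma: float = 0.02                  # equation 2
    eta: float = 0.2                     # equation 5
    xi: float = 0.5                      # equation 6
    ddim_steps: int = 250                # default sampling steps
    num_gpus: int = 4                    # Nvidia 3090


def ema_update(ema_params, params, rate):
    """Standard EMA step: ema <- rate * ema + (1 - rate) * param."""
    return [rate * e + (1.0 - rate) * p for e, p in zip(ema_params, params)]


def ddim_timesteps(train_steps, sample_steps):
    """Evenly strided subset of training timesteps, descending for sampling.

    Uniform striding is an assumption; the summary does not describe the schedule.
    """
    stride = train_steps / sample_steps
    return [round(i * stride) for i in range(sample_steps)][::-1]


cfg = ExperimentSetup()
# ASSUMPTION: 1000 training timesteps (not stated in the summary).
steps = ddim_timesteps(train_steps=1000, sample_steps=cfg.ddim_steps)
smoothed = ema_update([1.0], [0.0], cfg.ema_rate)
```

With an EMA rate of 0.9999, each update moves the averaged weights only 0.01% of the way toward the current weights, which is why it smooths training noise over long runs such as the 200k iterations reported here.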