The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling

Authors: Jiajun Ma, Shuchen Xue, Tianyang Hu, Wenjia Wang, Zhaoqiang Liu, Zhenguo Li, Zhi-Ming Ma, Kenji Kawaguchi

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we found that our Skip-Tuning not only significantly improves the image quality in the few-shot case, but also is universally helpful for more sampling steps. Surprisingly, we can break the limit of ODE samplers in only 10 NFEs with EDM (Karras et al., 2022) on ImageNet (Deng et al., 2009) and beat the heavily optimized EDM-2 (Karras et al., 2023) with only 39 NFEs. Our method generalizes well across a wide range of DPMs with various architectures, e.g., LDM (Rombach et al., 2022) and UViT (Bao et al., 2023).
Researcher Affiliation | Collaboration | 1 The Hong Kong University of Science and Technology, 2 Hong Kong University of Science and Technology (Guangzhou), 3 University of Chinese Academy of Sciences, 4 Academy of Mathematics and Systems Science, 5 Huawei Noah's Ark Lab, 6 University of Electronic Science and Technology of China, 7 National University of Singapore.
Pseudocode | Yes | Appendix C ('Details on Group Normalization in UNet Block') contains a Python code block starting with 'def forward(self, x, emb):'.
Open Source Code | No | The paper does not explicitly state that the source code for the proposed method is open-source, nor does it provide a link to a code repository for the implementation.
Open Datasets | Yes | The experiments use publicly available datasets, e.g.: Surprisingly, we can break the limit of ODE samplers in only 10 NFEs with EDM (Karras et al., 2022) on ImageNet (Deng et al., 2009) and beat the heavily optimized EDM-2 (Karras et al., 2023) with only 39 NFEs.
Dataset Splits | No | The paper mentions training and testing on datasets such as ImageNet and AFHQv2, but it does not specify explicit training/validation/test splits (e.g., percentages, sample counts, or references to standard validation sets) needed for reproduction.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using Stable Diffusion 2 checkpoints but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | No | The paper refers to 'standard class-conditional generation following the settings in (Karras et al., 2022)' and discusses skip coefficients (ρ; see the illustrative sketch after this table), but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training configurations for its experiments.
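For context on the mechanism those skip coefficients act on, below is a minimal sketch of a UNet decoder block in which the encoder skip feature is rescaled by a scalar coefficient rho before being fused with the upsampled decoder feature, with the timestep embedding folded in as a per-channel shift. This is not the authors' released code: the class name SkipScaledDecoderBlock, the channel sizes, the concatenation-based fusion, and the embedding projection are assumptions about a generic EDM-style UNet, included only to show where a coefficient like ρ would enter the forward pass (cf. the 'def forward(self, x, emb):' block the paper includes in Appendix C).

    import torch
    import torch.nn as nn

    class SkipScaledDecoderBlock(nn.Module):
        # Hypothetical decoder block, for illustration only: a standard UNet
        # decoder stage whose incoming skip feature is multiplied by a scalar
        # coefficient rho before concatenation. rho = 1.0 recovers the
        # unmodified block; rho < 1.0 damps the skip branch.
        def __init__(self, dec_channels, skip_channels, out_channels, emb_channels, rho=1.0):
            super().__init__()
            self.rho = rho
            self.conv = nn.Conv2d(dec_channels + skip_channels, out_channels, 3, padding=1)
            self.norm = nn.GroupNorm(32, out_channels)              # group normalization, as in the UNet blocks discussed in Appendix C
            self.emb_proj = nn.Linear(emb_channels, out_channels)   # timestep-embedding shift (assumed wiring)
            self.act = nn.SiLU()

        def forward(self, x, skip, emb):
            # Scale the skip branch, then fuse it with the decoder branch.
            h = torch.cat([x, self.rho * skip], dim=1)
            h = self.norm(self.conv(h))
            # Add the projected timestep embedding as a per-channel shift.
            h = h + self.emb_proj(emb)[:, :, None, None]
            return self.act(h)

    # Usage example with made-up shapes.
    block = SkipScaledDecoderBlock(dec_channels=128, skip_channels=64,
                                   out_channels=128, emb_channels=256, rho=0.9)
    x = torch.randn(2, 128, 16, 16)    # upsampled decoder feature
    skip = torch.randn(2, 64, 16, 16)  # encoder skip feature at the same resolution
    emb = torch.randn(2, 256)          # timestep embedding
    print(block(x, skip, emb).shape)   # torch.Size([2, 128, 16, 16])

Setting rho = 1.0 leaves the block unchanged, while values below 1.0 damp the skip branch; the sketch fixes a single rho per block purely to show where a per-connection coefficient of this kind would be applied.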