Patched Denoising Diffusion Models For High-Resolution Image Synthesis

Authors: Zheng Ding, Mengqi Zhang, Jiajun Wu, Zhuowen Tu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method with previous patch-based generation methods and achieve state-of-the-art FID scores on all six datasets.
Researcher Affiliation | Academia | ¹UC San Diego, ²Stanford University
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. Figure 3 shows a diagram, not pseudocode.
Open Source Code | Yes | Project page: https://patchdm.github.io.
Open Datasets | Yes | Patch-DM produces high-quality image synthesis results on our newly collected dataset of nature images (1024×512), as well as on standard benchmarks of LHQ (1024×1024), FFHQ (1024×1024) and on other datasets with smaller sizes (256×256), including LSUN-Bedroom, LSUN-Church, and FFHQ. (...) Wallpaperscraft. Wallpaperscraft. https://wallpaperscraft.com/, 2013-2023.
Dataset Splits | No | The paper mentions using standard benchmarks like FFHQ and LSUN datasets, which commonly have predefined splits, and refers to “validation dataset” in application sections. However, it does not explicitly state the specific train/validation/test percentages, counts, or the methodology for splitting the datasets used in their experiments.
Hardware Specification | No | The paper does not explicitly state the specific hardware used for running the experiments, such as GPU models or CPU specifications.
Software Dependencies | No | The paper mentions using CLIP, Adam optimizer, and basing models on prior work, but does not provide specific version numbers for any software dependencies like programming languages, frameworks (e.g., PyTorch, TensorFlow), or libraries.
Experiment Setup | Yes | The hyperparameters for the main denoising U-Net model are specified in Table 5. (...) The dropout rate is 0.1 for the global conditions and 0.5 for position embeddings. (...) We use a patch size of 64×64 in all our experiments. (...) We use the inference process proposed by DDIM (Song et al., 2021) and set the sampling step to 50.
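
To make the numbers in the Experiment Setup row concrete, the sketch below collects the quoted values in a small Python config and counts how many 64×64 patches would cover each quoted image size. The class name, field names, and the `patch_grid` helper are hypothetical and do not come from the paper or its code. The plain non-overlapping grid is an assumption used only for counting; it is not a description of how Patch-DM arranges patches.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row (field names are hypothetical)."""
    patch_size: int = 64              # patch size of 64x64 in all experiments
    global_cond_dropout: float = 0.1  # dropout rate for the global conditions
    pos_embed_dropout: float = 0.5    # dropout rate for position embeddings
    ddim_steps: int = 50              # DDIM sampling steps at inference


def patch_grid(width: int, height: int, patch_size: int) -> Tuple[int, int]:
    """Return (columns, rows) of a non-overlapping patch grid.

    Assumption: a simple tiling used for counting only, not the paper's scheme.
    """
    if width % patch_size or height % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    return width // patch_size, height // patch_size


setup = ReportedSetup()

# Image sizes quoted in the Open Datasets row; the labels are informal.
sizes = {
    "nature images": (1024, 512),
    "LHQ / FFHQ": (1024, 1024),
    "LSUN / FFHQ (small)": (256, 256),
}

for label, (w, h) in sizes.items():
    cols, rows = patch_grid(w, h, setup.patch_size)
    print(f"{label}: {cols} x {rows} = {cols * rows} patches")
```

Under this simplified tiling, a 1024×512 image corresponds to 128 patches, a 1024×1024 image to 256, and a 256×256 image to 16.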