simple diffusion: End-to-end diffusion for high resolution images

Authors: Emiel Hoogeboom, Jonathan Heek, Tim Salimans

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Combining these simple yet effective techniques, we achieve state-of-the-art on image generation among diffusion models without sampling modifiers on ImageNet."
Researcher Affiliation | Industry | "Google Research, Brain Team, Amsterdam, Netherlands."
Pseudocode | Yes | Appendix B.2.1, "Pseudo-code for U-ViT modules".
Open Source Code | No | The paper does not include an unambiguous statement or a direct link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The paper uses the ImageNet and MS-COCO datasets, both well-known public datasets in machine learning.
Dataset Splits | No | The paper mentions "train and eval data splits" but does not explicitly specify a distinct validation split with percentages, counts, or references to predefined validation sets.
Hardware Specification | Yes | "The smaller U-Net models can be trained on 64 TPUv2 devices... The large U-ViT models are all trained using 128 TPUv4 devices..."
Software Dependencies | No | The paper mentions software such as JAX and Flax but does not provide specific version numbers for these or any other ancillary software components used in the experiments.
Experiment Setup | Yes | Settings for the UNet on ImageNet 128 experiment (sketched in code below): base_channels=128, emb_channels=1024 (for diffusion time and image class), channel_multiplier=[1, 2, 4, 8, 8], num_res_blocks=[3, 4, 4, 12, 4] (unless noted otherwise), attn_resolutions=[8, 16], num_heads=4, dropout_from_resolution=16 (unless noted otherwise), dropout=0.1, patching_type=none, schedule={name: cosine_shifted, shift: 64} (unless noted otherwise), num_train_steps=1_500_000.
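
For concreteness, here is a minimal Python sketch of the quoted experiment setup and the shifted cosine noise schedule it names. The dict keys mirror the hyperparameter names quoted above; the names `unet_imagenet128_config` and `shifted_cosine_logsnr`, and the function's signature, are illustrative assumptions rather than code released with the paper. The log-SNR formula follows the paper's shifted cosine definition.

```python
import math

# A minimal sketch (not the authors' released code) of the UNet/ImageNet-128
# configuration quoted in the "Experiment Setup" row above.
unet_imagenet128_config = {
    "base_channels": 128,
    "emb_channels": 1024,  # embedding size for diffusion time and image class
    "channel_multiplier": [1, 2, 4, 8, 8],
    "num_res_blocks": [3, 4, 4, 12, 4],
    "attn_resolutions": [8, 16],
    "num_heads": 4,
    "dropout_from_resolution": 16,
    "dropout": 0.1,
    "patching_type": "none",
    "schedule": {"name": "cosine_shifted", "shift": 64},
    "num_train_steps": 1_500_000,
}


def shifted_cosine_logsnr(t: float, resolution: int, shift: int = 64) -> float:
    """Log-SNR of the shifted cosine schedule for t in (0, 1):

        logSNR(t) = -2 * log(tan(pi * t / 2)) + 2 * log(shift / resolution)

    When resolution == shift, the second term vanishes and this reduces
    to the standard cosine schedule.
    """
    return (-2.0 * math.log(math.tan(math.pi * t / 2.0))
            + 2.0 * math.log(shift / resolution))
```

At resolution 128 with shift 64, the schedule subtracts 2·log 2 ≈ 1.39 from the cosine log-SNR at every t, i.e. it adds noise across the whole trajectory, which is the paper's adjustment for training at higher resolutions.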