Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

Authors: Giannis Daras, Alex Dimakis, Constantinos Costis Daskalakis

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4 (Experimental Evaluation): "In this section, we measure how much pre-trained foundation diffusion models memorize data from their training set. We perform our experiments with Stable Diffusion XL (Podell et al., 2023) (SDXL), as it is the state-of-the-art open-source image generation diffusion model."
Researcher Affiliation | Collaboration | Giannis Daras (1, 2), Alexandros G. Dimakis (3), Constantinos Daskalakis (4, 2); equal contribution. Affiliations: (1) Department of Computer Science, University of Texas at Austin; (2) Archimedes AI; (3) Department of Electrical and Computer Engineering, University of Texas at Austin; (4) Department of Electrical Engineering and Computer Science, MIT.
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | "We open-source our code to facilitate further research in this area: https://github.com/giannisdaras/ambienttweedie."
Open Datasets | Yes | "We take a random 10,000 image subset of LAION and we corrupt it severely. We finetune our models on FFHQ, at 1024×1024 resolution, since it is a standard benchmark for image generation. We finetune SDXL on a dataset of chest x-rays."
Dataset Splits | No | The paper mentions training on FFHQ and using '32 evaluation samples from FFHQ' for a specific denoising performance test. However, it does not provide explicit training/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility, nor does it state that standard splits were used for the main training.
Hardware Specification | No | The paper states, 'We train all of our models on 16-bit precision to reduce the memory requirements and accelerate training speed,' but it does not specify any particular hardware components such as CPU models, GPU models (e.g., NVIDIA A100), or memory configurations used for the experiments.
Software Dependencies | No | The paper mentions using the 'Adam optimizer', 'LoRA with rank 4, following the implementation of SDXL finetuning from the diffusers Github repository', and the 'DDIM sampling algorithm'. However, it does not specify version numbers for Python, PyTorch, or any other critical software libraries or frameworks used in the experiments.
Experiment Setup | Yes | "We train all our models with a batch size of 16 using a constant learning rate of 1e-5. For all our experiments, we use the Adam optimizer with the following hyperparameters: β1 = 0.9, β2 = 0.999, weight decay = 0.01. We train all of our models for at least 200,000 steps, or roughly 45 epochs on FFHQ. During finetuning, we used a weight of 0.01 for the consistency loss for the tn ∈ {100, 500} models and a weight of 1e-4 for our tn = 800 model. We use LoRA with rank 4." (A hedged configuration sketch based on this row follows the table.)
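
The Experiment Setup row maps directly onto a fine-tuning configuration. The sketch below is a minimal assembly of the reported hyperparameters (batch size 16, constant learning rate 1e-5, Adam with β1 = 0.9, β2 = 0.999, weight decay 0.01, LoRA rank 4, 16-bit precision) in the style of the diffusers SDXL LoRA example. The checkpoint identifier, LoRA target modules, and the choice of AdamW are assumptions rather than details confirmed by the paper; the authors' released code is the authoritative reference.

    import torch
    from diffusers import UNet2DConditionModel
    from peft import LoraConfig

    # Assumption: the SDXL base checkpoint from the Hugging Face hub; the
    # paper says SDXL is fine-tuned but does not name an exact identifier.
    unet = UNet2DConditionModel.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        subfolder="unet",
        torch_dtype=torch.float16,  # 16-bit precision, as reported
    )

    # LoRA with rank 4; the target modules follow the diffusers SDXL LoRA
    # example and are an assumption, not something the paper specifies.
    unet.add_adapter(LoraConfig(
        r=4,
        lora_alpha=4,
        target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    ))

    # Reported optimizer settings: constant LR 1e-5, betas (0.9, 0.999),
    # weight decay 0.01. AdamW is used here, as in the diffusers script;
    # the paper itself only says "Adam optimizer" with weight decay.
    params = [p for p in unet.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(
        params, lr=1e-5, betas=(0.9, 0.999), weight_decay=0.01
    )

    BATCH_SIZE = 16             # reported batch size
    MIN_STEPS = 200_000         # "at least 200,000 steps" (~45 FFHQ epochs)
    CONSISTENCY_WEIGHT = 0.01   # 0.01 for t_n in {100, 500}; 1e-4 for t_n = 800

The data loading, training loop, and the ambient/consistency loss computation are omitted here, since they depend on the released implementation rather than on the hyperparameters quoted in the table.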