Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Authors: Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on both linear and nonlinear inverse problems and demonstrate that our technique greatly improves the performance of the baseline solver and achieves up to 10× acceleration in mean sampling speed. We perform in-depth numerical experiments on multiple datasets and on both linear and nonlinear noisy inverse problems. We demonstrate that (1) Flash-Diffusion greatly improves the reconstruction performance of the baseline solver, especially in case of degradations with highly varying severity, and (2) Flash-Diffusion accelerates the baseline solver by up to 10× without any loss in reconstruction quality. |
| Researcher Affiliation | Academia | Zalan Fabian*, Berk Tinaz*, Mahdi Soltanolkotabi; Dept. of Electrical and Comp. Eng., Univ. of Southern California, Los Angeles, CA. Correspondence to: Zalan Fabian <zfabian@usc.edu>. |
| Pseudocode | No | The paper provides mathematical formulations of algorithms (e.g., DDPM, DPS updates) but does not present them in structured pseudocode blocks or clearly labeled algorithm environments. |
| Open Source Code | Yes | Code is available at https://github.com/z-fabian/flash-diffusion |
| Open Datasets | Yes | Dataset: We perform experiments on CelebA-HQ (256×256) (Karras et al., 2018) and LSUN Bedrooms (Yu et al., 2015). |
| Dataset Splits | Yes | We match the training and validation splits used in (Rombach et al., 2022), and set aside 200 images from the validation split for testing. For experiments on CelebA-HQ and FFHQ datasets, we use the train and validation splits provided in the GitHub repo of "Taming Transformers". For LSUN Bedrooms experiments, we use the custom split provided in the GitHub repo of "Latent Diffusion". (A hedged sketch of the test-split carve-out appears below the table.) |
| Hardware Specification | Yes | We use Quadro RTX 5000 and Titan GPUs. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and implicitly PyTorch ('torch') for implementations, but does not specify versions for the programming language (e.g., Python), deep learning framework (e.g., PyTorch), or CUDA. |
| Experiment Setup | Yes | Training setup: We train severity encoders using the Adam optimizer with batch size 28 and learning rate 0.0001 for about 200k steps until the loss on the validation set converges. We use Quadro RTX 5000 and Titan GPUs. Hyperparameters: We scale the reconstruction loss terms with their corresponding dimension (d for L_lat.rec. and n for L_im.rec.), which we find to be sufficient without tuning λ_im.rec.. We tune λ_σ via grid search over [0.1, 1, 10] on the varying Gaussian blur task and set it to 10 for all experiments. For Flash experiments, we tune the noise correction parameter c on the validation subset by grid search over [0.8, 1.0, 1.2] after tuning the remaining baseline method hyperparameters. ... We tune η_DPS > 0 by performing a grid search over [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0] for all experiments. We provide the optimal hyperparameters in Table 2. (Hedged training and tuning sketches appear below the table.) |
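
Based on the Dataset Splits row, here is a minimal sketch of carving a 200-image test set out of the validation split. The file pattern, seed, and variable names are assumptions for illustration; only the 200-image carve-out follows the paper.

```python
import glob
import random

# Hypothetical layout: validation images from the "Taming Transformers" split.
val_files = sorted(glob.glob("data/celebahq/val/*.png"))

rng = random.Random(0)                     # fixed seed: an assumption, for reproducibility
test_files = rng.sample(val_files, k=200)  # 200 validation images set aside for testing
test_set = set(test_files)
val_files = [f for f in val_files if f not in test_set]
```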
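
The Experiment Setup row describes the severity-encoder training recipe. The sketch below illustrates it with a toy encoder and synthetic data at reduced dimensions so it runs quickly; only the Adam optimizer, batch size 28, learning rate 1e-4, the ~200k-step budget, the dimension-scaled reconstruction loss, and λ_σ = 10 come from the paper. Everything else (architecture, loss targets, and whether "scaling by the corresponding dimension" multiplies or normalizes) is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d, n = 4 * 8 * 8, 3 * 32 * 32  # toy latent/image dims; the paper uses 256x256 images
lambda_sigma = 10.0            # grid-searched over [0.1, 1, 10] in the paper

# Toy stand-in for the severity encoder: predicts a latent and a severity estimate.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(n, d + 1))

opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)  # Adam, lr 0.0001 per the paper

for step in range(200_000):  # ~200k steps, until validation loss converges in practice
    y = torch.randn(28, 3, 32, 32)   # degraded images, batch size 28 per the paper
    z_target = torch.randn(28, d)    # target clean latents (synthetic stand-in)
    sigma_target = torch.rand(28, 1) # target degradation severity (synthetic stand-in)

    out = encoder(y)
    z_pred, sigma_pred = out[:, :d], out[:, d:]
    loss = (d * F.mse_loss(z_pred, z_target)                       # L_lat.rec., scaled by d
            + lambda_sigma * F.mse_loss(sigma_pred, sigma_target)) # L_sigma, weight 10
    # L_im.rec. (scaled by n) would require decoding through the LDM decoder; omitted here.
    opt.zero_grad()
    loss.backward()
    opt.step()
```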
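
The same row reports two grid searches: η_DPS over [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0] for the baseline solver, then the Flash noise-correction parameter c over [0.8, 1.0, 1.2] with the baseline hyperparameters fixed. A sketch of that two-stage tuning follows; `validate()` is a hypothetical stand-in for actually running the solver and scoring reconstructions on the validation subset.

```python
def validate(eta_dps, noise_correction=1.0):
    # Stand-in for reconstruction quality (e.g., mean PSNR) on the validation
    # subset; a real implementation would run the diffusion solver here.
    return -(eta_dps - 1.0) ** 2 - (noise_correction - 1.0) ** 2

eta_grid = [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0]  # eta_DPS grid per the paper
c_grid = [0.8, 1.0, 1.2]                         # Flash noise-correction grid

# Stage 1: tune the baseline solver's eta_DPS.
best_eta = max(eta_grid, key=validate)
# Stage 2: with baseline hyperparameters fixed, tune c for Flash.
best_c = max(c_grid, key=lambda c: validate(best_eta, noise_correction=c))
print(best_eta, best_c)
```

Tuning sequentially rather than jointly keeps the search at 7 + 3 solver evaluations instead of 21, consistent with the paper's description of tuning c after the remaining baseline hyperparameters.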