The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing

Authors: Shen Nie, Hanzhong Allan Guo, Cheng Lu, Yuhao Zhou, Chenyu Zheng, Chongxuan Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Sec. 5, we conduct extensive experiments on various tasks including inpainting, image-to-image translation, and dragging, where the SDE counterparts show a consistent and substantial improvement over the widely used baselines.
Researcher Affiliation | Academia | 1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 2) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; 3) Department of Computer Science and Technology, Tsinghua University, Beijing, China
Pseudocode | Yes | Algorithm 1: SDE-Drag; Algorithm 2: DDIM sampler; Algorithm 3: DDPM sampler; Algorithm 4: DDIM inversion; Algorithm 5: Cycle-SDE (based on DDPM). A minimal sketch contrasting the ODE and SDE sampler steps follows the table.
Open Source Code | Yes | See the project page https://ml-gsai.github.io/SDE-Drag-demo/ for the code and the Drag Bench dataset.
Open Datasets | Yes | We employ the Places (Zhou et al., 2017) dataset for evaluation and adopt FID (Heusel et al., 2017) and LPIPS (Zhang et al., 2018) to measure the sample quality. We utilize the FID to measure the similarity between translated images and the target dataset... We introduce Drag Bench, a challenging benchmark consisting of 100 image-caption pairs from the internet... We release Drag Bench on our project page.
Dataset Splits | Yes | We randomly select 100 image-caption pairs from the COCO validation set for our dataset and use PSNR to measure reconstruction quality. A PSNR sketch follows the table.
Hardware Specification | Yes | NVIDIA A100; NVIDIA GeForce RTX 3090
Software Dependencies | No | The paper mentions specific models (e.g., Stable Diffusion 1.5) but does not provide version numbers for general software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | The only tuned hyperparameter is the number of sampling steps n, and we conduct systematic experiments with 25, 50, and 100 steps. For dragging: r = 5, n = 120, t0 = 0.6T, α = 1.1, β = 0.3, and m = as at /2; we linearly increase the CFG from 1 to 3 as the time goes from 0 to t0. We set the fine-tuning learning rate to 2 × 10⁻⁴, the LoRA rank to 4, and the number of training steps to 100. A configuration sketch follows the table.
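
For readers comparing the DDIM (ODE) and DDPM/Cycle-SDE (SDE) samplers listed in the Pseudocode row, below is a minimal sketch of a single reverse step in the standard η-parameterisation, where η = 0 gives the deterministic DDIM (probability-flow ODE) update and η = 1 gives a stochastic DDPM-style (reverse SDE) update. This is not the authors' code; all names are illustrative.

```python
import torch

def reverse_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, eta=0.0):
    """One reverse diffusion step from t to t-1.

    eta = 0.0 -> deterministic DDIM step (ODE-like),
    eta = 1.0 -> stochastic DDPM-style step (SDE-like).
    alpha_bar_* are scalar tensors from the cumulative noise schedule.
    """
    # Clean-image estimate implied by the current sample and noise prediction.
    x0_pred = (x_t - (1 - alpha_bar_t).sqrt() * eps_pred) / alpha_bar_t.sqrt()
    # Standard deviation of the injected noise; zero for the ODE sampler.
    sigma = eta * ((1 - alpha_bar_prev) / (1 - alpha_bar_t)).sqrt() \
                * (1 - alpha_bar_t / alpha_bar_prev).sqrt()
    # Deterministic part of the update.
    mean = alpha_bar_prev.sqrt() * x0_pred \
        + (1 - alpha_bar_prev - sigma ** 2).clamp(min=0).sqrt() * eps_pred
    noise = torch.randn_like(x_t) if eta > 0 else torch.zeros_like(x_t)
    return mean + sigma * noise
```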
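
The reconstruction quality mentioned in the Dataset Splits row is measured with the standard peak signal-to-noise ratio; a minimal sketch, assuming image tensors with values in [0, 1]:

```python
import torch

def psnr(x, y, max_val=1.0):
    """Peak signal-to-noise ratio between two images with values in [0, max_val]."""
    mse = torch.mean((x - y) ** 2)
    return 10.0 * torch.log10(torch.tensor(max_val ** 2) / mse)
```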
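
The dragging setup quoted in the Experiment Setup row can be summarised as a configuration sketch. The dictionary keys and the cfg_scale helper below are assumptions made for illustration, not the authors' actual config; only the numeric values come from the row above.

```python
# Hypothetical configuration mirroring the reported SDE-Drag hyperparameters;
# keys are assumed names, values are taken from the Experiment Setup row.
drag_config = {
    "r": 5,              # radius parameter reported in the setup
    "n": 120,            # number of sampling steps
    "t0_fraction": 0.6,  # t0 = 0.6 * T
    "alpha": 1.1,        # α reported for dragging
    "beta": 0.3,         # β reported for dragging
    "lora_rank": 4,
    "lora_lr": 2e-4,     # fine-tuning learning rate 2 × 10⁻⁴
    "lora_steps": 100,
}

def cfg_scale(t, t0, low=1.0, high=3.0):
    """Classifier-free guidance weight, increased linearly from 1 to 3 as t goes 0 -> t0."""
    frac = min(max(t / t0, 0.0), 1.0)
    return low + (high - low) * frac
```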