The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing
Authors: Shen Nie, Hanzhong Allan Guo, Cheng Lu, Yuhao Zhou, Chenyu Zheng, Chongxuan Li
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Sec. 5, we conduct extensive experiments on various tasks including inpainting, image-to-image translation, and dragging, where the SDE counterparts show a consistent and substantial improvement over the widely used baselines. |
| Researcher Affiliation | Academia | (1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; (3) Department of Computer Science and Technology, Tsinghua University, Beijing, China |
| Pseudocode | Yes | Algorithm 1 SDE-Drag; Algorithm 2 DDIM sampler; Algorithm 3 DDPM sampler; Algorithm 4 DDIM inversion; Algorithm 5 Cycle-SDE (based on DDPM) (a minimal sampler sketch follows this table) |
| Open Source Code | Yes | See the project page https://ml-gsai.github.io/SDE-Drag-demo/ for the code and Drag Bench dataset. |
| Open Datasets | Yes | We employ the Places (Zhou et al., 2017) dataset for evaluation and adopt FID (Heusel et al., 2017) and LPIPS (Zhang et al., 2018) to measure the sample quality.; We utilize the FID to measure the similarity between translated images and the target dataset...; We introduce Drag Bench, a challenging benchmark consisting of 100 image-caption pairs from the internet... We release Drag Bench on our project page. (a metric sketch covering LPIPS and PSNR follows this table) |
| Dataset Splits | Yes | We randomly select 100 image-caption pairs from the COCO validation set for our dataset and use PSNR to measure reconstruction quality. |
| Hardware Specification | Yes | NVIDIA A100; NVIDIA GeForce RTX 3090 |
| Software Dependencies | No | The paper mentions using specific models (e.g., Stable Diffusion 1.5) but does not provide version numbers for general software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The only tuned hyperparameter is the number of sampling steps n, and we conduct systematic experiments with 25, 50, and 100 steps.; For dragging: r = 5, n = 120, t0 = 0.6T, α = 1.1, β = 0.3, and m = ⌈‖a_s − a_t‖₂ / 2⌉ (half the pixel distance between the source point a_s and the target point a_t); we linearly increase the CFG from 1 to 3 as the time goes from 0 to t0 (a schedule sketch follows this table). We set the fine-tuning learning rate to 2 × 10⁻⁴, the LoRA rank to 4, and the number of training steps to 100. |
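
For reference against the Pseudocode row: the paper contrasts the deterministic DDIM (ODE) sampler with the stochastic DDPM (SDE) sampler. Below is a minimal sketch of one reverse step of each under the standard cumulative noise-schedule (ᾱ) parameterization; the function and variable names are ours, not the paper's, and this is a sketch rather than the paper's implementation.

```python
import torch

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_s):
    """One deterministic DDIM (probability-flow ODE) step from time t to s < t.

    x_t: current latent; eps_pred: predicted noise eps_theta(x_t, t);
    alpha_bar_t, alpha_bar_s: cumulative schedule values as 0-dim tensors in (0, 1].
    """
    # Predict the clean sample x_0 from the noise prediction.
    x0_pred = (x_t - (1 - alpha_bar_t).sqrt() * eps_pred) / alpha_bar_t.sqrt()
    # Deterministic update: re-noise x_0 to time s with no fresh noise.
    return alpha_bar_s.sqrt() * x0_pred + (1 - alpha_bar_s).sqrt() * eps_pred

def ddpm_step(x_t, eps_pred, alpha_bar_t, alpha_bar_s):
    """One stochastic DDPM (SDE) step: same mean direction, plus fresh Gaussian noise
    (the eta = 1 case of the DDIM family, which recovers ancestral DDPM sampling)."""
    x0_pred = (x_t - (1 - alpha_bar_t).sqrt() * eps_pred) / alpha_bar_t.sqrt()
    alpha_t_over_s = alpha_bar_t / alpha_bar_s
    sigma = ((1 - alpha_bar_s) / (1 - alpha_bar_t) * (1 - alpha_t_over_s)).sqrt()
    dir_coeff = (1 - alpha_bar_s - sigma**2).clamp(min=0).sqrt()
    return alpha_bar_s.sqrt() * x0_pred + dir_coeff * eps_pred + sigma * torch.randn_like(x_t)
```

Run from the same starting latent, the ODE path is fully reproducible, while the SDE path injects fresh noise at every step; that injected noise is the property the paper credits for the SDE samplers' advantage in editing.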
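
The evaluation rows above cite FID, LPIPS, and PSNR. The two metrics that are simple to reproduce locally are sketched below, using NumPy for PSNR and the official `lpips` package (Zhang et al., 2018) for LPIPS; FID requires a large sample set and a reference implementation, so it is omitted here. This is a generic sketch, not the paper's evaluation code.

```python
import numpy as np
import torch
import lpips  # pip install lpips

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two same-shape images."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val**2 / mse)

# LPIPS expects NCHW tensors scaled to [-1, 1]; 'alex' is the backbone
# recommended by the LPIPS authors for evaluation.
loss_fn = lpips.LPIPS(net="alex")

def lpips_distance(img0: torch.Tensor, img1: torch.Tensor) -> float:
    """Perceptual distance between two (1, 3, H, W) tensors in [-1, 1]."""
    with torch.no_grad():
        return loss_fn(img0, img1).item()
```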
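
The Experiment Setup row states that the CFG weight rises linearly from 1 to 3 as time goes from 0 to t0. Below is a sketch of that schedule combined with the standard classifier-free guidance mixing rule; the function names and the clamping behavior for t > t0 are our assumptions, not details given in the paper.

```python
import torch

def cfg_scale(t: float, t0: float, w_min: float = 1.0, w_max: float = 3.0) -> float:
    """Guidance weight rising linearly from w_min at t = 0 to w_max at t = t0
    (clamped at w_max for t > t0; the clamp is an assumption)."""
    return w_min + (w_max - w_min) * min(t / t0, 1.0)

def guided_eps(eps_cond: torch.Tensor, eps_uncond: torch.Tensor,
               t: float, t0: float) -> torch.Tensor:
    """Standard classifier-free guidance: mix conditional and unconditional
    noise predictions with the time-dependent weight above."""
    w = cfg_scale(t, t0)
    return eps_uncond + w * (eps_cond - eps_uncond)
```

The LoRA settings quoted in the same row (rank 4, learning rate 2 × 10⁻⁴, 100 steps) would map onto, for example, a `peft` `LoraConfig(r=4)` with an optimizer at that rate, though the paper does not name the fine-tuning library.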