High-Fidelity Diffusion-Based Image Editing
Authors: Chen Hou, Guoqiang Wei, Zhibo Chen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our proposed framework and training strategy achieve high-fidelity reconstruction and editing results across various levels of denoising steps, while exhibiting exceptional performance in both quantitative metrics and qualitative assessments. ... We conduct experiments on the FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets, evaluating outcomes at various levels of denoising steps, and all pretrained models are kept frozen. ... We present both quantitative and qualitative evaluations of image reconstruction. |
| Researcher Affiliation | Collaboration | Chen Hou¹, Guoqiang Wei², Zhibo Chen¹* (¹University of Science and Technology of China; ²ByteDance Research). Emails: houchen@mail.ustc.edu.cn, weiguoqiang.9@bytedance.com, chenzhibo@ustc.edu.cn |
| Pseudocode | Yes | Algorithm 1: Editing Training Strategy. 1: **repeat** 2: $x_0 \sim q(x_0)$ 3: $t \sim \mathrm{Uniform}(\{1, \dots, T\})$ 4: $\epsilon \sim \mathcal{N}(0, I)$ 5: $x_t = \sqrt{\alpha_t}\,x_0 + \sqrt{1 - \alpha_t}\,\epsilon$ 6: $\hat\theta \leftarrow \theta\,(1 + R(x_0, P_t[\epsilon_\theta^t(x_t)], t))$ 7: Take a gradient descent step on $\nabla_R \mathcal{L}_{\mathrm{direction}}(P_t[\epsilon_{\hat\theta}^t(x_t)], t_{\mathrm{tar}}; x_0, t_{\mathrm{src}}) + \nabla_R \mathcal{L}_{\ell_1}(P_t[\epsilon_{\hat\theta}^t(x_t)], x_0)$ 8: **until** converged (a hedged PyTorch sketch of this loop appears below the table) |
| Open Source Code | No | The paper does not provide any specific links to source code or explicit statements about its availability (e.g., 'code will be released', 'available at GitHub'). |
| Open Datasets | Yes | We conduct experiments on the FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets, evaluating outcomes at various levels of denoising steps, and all pretrained models are kept frozen. |
| Dataset Splits | No | No specific training/validation/test dataset splits (e.g., percentages or sample counts for each split) are explicitly mentioned for reproducibility. |
| Hardware Specification | Yes | our model is GPU-efficient and is able to complete all training tasks on a single RTX 3090 Ti GPU. |
| Software Dependencies | No | The paper mentions using DDPM and DDPM++ as foundation models, but does not provide specific version numbers for software dependencies like PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | We conduct experiments on the FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets, evaluating outcomes at various levels of denoising steps, and all pretrained models are kept frozen. ... For the loss function, we choose the noise-fitting loss as our training objective: $\mathcal{L}_{\mathrm{rec}} := \mathbb{E}_{t, x_0, \epsilon}\,\|\epsilon - \hat\epsilon_\theta^t(x_t)\|^2$ ... Our final loss function for training editing is: $\mathcal{L}_{\mathrm{edit}} := \lambda_{\mathrm{CLIP}}\,\mathcal{L}_{\mathrm{direction}} + \lambda_{\mathrm{recon}}\,\mathcal{L}_{\ell_1}$ (Eq. 6). (See the loss and training-step sketches below the table.) |
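
The noise-fitting objective $\mathcal{L}_{\mathrm{rec}}$ quoted above pairs the standard DDPM forward process with an MSE on the predicted noise. Below is a minimal PyTorch sketch of that objective; the `eps_model(x_t, t)` interface, the `alpha_bar` schedule tensor, and all shapes are assumptions for illustration, not the authors' code.

```python
import torch

def noise_fitting_loss(eps_model, x0, alpha_bar):
    """Hypothetical sketch of L_rec = E_{t, x0, eps} || eps - eps_hat_theta^t(x_t) ||^2."""
    B, T = x0.shape[0], alpha_bar.shape[0]
    t = torch.randint(0, T, (B,), device=x0.device)   # t ~ Uniform({1, ..., T})
    eps = torch.randn_like(x0)                        # eps ~ N(0, I)
    a = alpha_bar[t].view(B, 1, 1, 1)                 # per-sample noise level
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps      # forward diffusion (Alg. 1, step 5)
    return ((eps - eps_model(x_t, t)) ** 2).mean()    # noise-fitting MSE
```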
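
Algorithm 1 can likewise be sketched as a single training step. One simplification is labeled in the comments: the paper modulates the frozen model's weights via $\hat\theta \leftarrow \theta(1 + R(\cdot))$, whereas this sketch applies the rectifier $R$ as a residual correction to the predicted $x_0$. The `direction_loss` callable stands in for the CLIP directional loss $\mathcal{L}_{\mathrm{direction}}$; every name and interface here is hypothetical.

```python
import torch

def predicted_x0(eps_pred, x_t, a_bar):
    # P_t[.]: standard DDPM identity recovering x0 from a noise prediction
    return (x_t - (1.0 - a_bar).sqrt() * eps_pred) / a_bar.sqrt()

def editing_train_step(eps_model, rectifier, direction_loss, x0, alpha_bar,
                       t_src, t_tar, lam_clip, lam_recon, optimizer):
    """One hypothetical training step following Algorithm 1.

    Only `rectifier` (R) receives gradients; the pretrained `eps_model` stays
    frozen. NOTE: the paper rectifies the model weights, theta_hat = theta *
    (1 + R(...)); here R instead corrects the predicted x0 directly, which is
    a simplification for illustration, not the authors' method.
    """
    B, T = x0.shape[0], alpha_bar.shape[0]
    t = torch.randint(0, T, (B,), device=x0.device)   # t ~ Uniform({1, ..., T})
    eps = torch.randn_like(x0)                        # eps ~ N(0, I)
    a = alpha_bar[t].view(B, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps      # forward diffusion

    with torch.no_grad():                             # frozen base model
        x0_hat = predicted_x0(eps_model(x_t, t), x_t, a)
    x0_rect = x0_hat + rectifier(x0, x0_hat, t)       # rectified prediction

    # L_edit = lambda_CLIP * L_direction + lambda_recon * L_l1   (Eq. 6)
    loss = lam_clip * direction_loss(x0_rect, t_tar, x0, t_src) \
         + lam_recon * (x0_rect - x0).abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Keeping the base model under `torch.no_grad()` matches the quoted setup in which all pretrained models are kept frozen and only the rectifier is optimized, consistent with the reported single-GPU (RTX 3090 Ti) training budget.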