High-Fidelity Diffusion-Based Image Editing

Authors: Chen Hou, Guoqiang Wei, Zhibo Chen

AAAI 2024

Reproducibility Variable | Result | LLM Response

Research Type: Experimental
"Extensive experiments demonstrate that our proposed framework and training strategy achieve high-fidelity reconstruction and editing results across various levels of denoising steps, while exhibiting exceptional performance in terms of both quantitative metrics and qualitative assessments." and "Experiments — Implementation Details: We conduct experiments on FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets with the outcomes of various levels of steps, and all pretrained models are kept frozen. ... We present both quantitative and qualitative evaluations of image reconstruction."
Researcher Affiliation: Collaboration
Chen Hou¹, Guoqiang Wei², Zhibo Chen¹* — ¹University of Science and Technology of China, ²ByteDance Research
houchen@mail.ustc.edu.cn, weiguoqiang.9@bytedance.com, chenzhibo@ustc.edu.cn
Pseudocode: Yes
Algorithm 1: Editing Training Strategy
1: repeat
2:   x₀ ∼ q(x₀)
3:   t ∼ Uniform({1, ..., T})
4:   ϵ ∼ N(0, I)
5:   x_t = √ᾱ_t · x₀ + √(1 − ᾱ_t) · ϵ
6:   θ̂ ← θ · (1 + R(x₀, P_t[ϵ_θ^t(x_t)], t))
7:   Take gradient descent step on ∇[L_direction(P_t[ϵ_θ̂^t(x_t)], t_tar; x₀, t_src) + L_ℓ1(P_t[ϵ_θ̂^t(x_t)], x₀)]
8: until converged
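Step 5 of Algorithm 1 is the standard DDPM forward-noising process. A minimal NumPy sketch of that step, assuming a linear beta schedule and toy image shapes (both illustrative choices, not taken from the paper):

```python
import numpy as np

def forward_diffuse(x0, t, alpha_bar, rng):
    """DDPM forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

# Hypothetical linear beta schedule (common DDPM default, assumed here)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of alphas

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))     # stand-in for a clean image
xt, eps = forward_diffuse(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

Because the step is an affine mix of x₀ and the sampled noise, x₀ can be recovered exactly when ϵ is known, which is what makes the noise-fitting objective in the setup row below well posed.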
Open Source Code: No
The paper does not provide any specific links to source code or explicit statements about its availability (e.g., "code will be released", "available on GitHub").
Open Datasets: Yes
"We conduct experiments on FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets with the outcomes of various levels of steps, and all pretrained models are kept frozen."
Dataset Splits: No
No specific training/validation/test dataset splits (e.g., percentages or sample counts for each split) are explicitly stated for reproducibility.
Hardware Specification: Yes
"Our model is GPU-efficient and is able to complete all training tasks on a single RTX 3090 Ti GPU."
Software Dependencies: No
The paper mentions using DDPM and DDPM++ as foundation models, but does not provide specific version numbers for software dependencies such as PyTorch, TensorFlow, or CUDA.
Experiment Setup: Yes
"We conduct experiments on FFHQ (Karras, Laine, and Aila 2019), CelebA-HQ (Karras et al. 2017), AFHQ-dog (Choi et al. 2020), METFACES (Karras et al. 2020), and LSUN-church/-bedroom (Yu et al. 2015) datasets with the outcomes of various levels of steps, and all pretrained models are kept frozen." For the loss function, the noise-fitting loss is chosen as the training objective:
L_rec := E_{t, x₀, ϵ} ‖ϵ − ϵ̂_θ^t(x_t)‖²
The final loss function for training editing is:
L_edit := λ_CLIP · L_direction + λ_recon · L_ℓ1.  (6)
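The editing objective in Eq. (6) combines a CLIP-style directional loss with an ℓ1 reconstruction term. A minimal NumPy sketch, where the embedding vectors are random stand-ins for CLIP features and all helper names (`l1_loss`, `directional_loss`, `edit_loss`) are illustrative, not the paper's API:

```python
import numpy as np

def l1_loss(pred, target):
    # L_l1: mean absolute error between the predicted image and the source x0
    return float(np.mean(np.abs(pred - target)))

def directional_loss(img_src, img_tar, txt_src, txt_tar):
    # CLIP-style directional loss: 1 - cosine similarity between the
    # image-embedding edit direction and the text-embedding edit direction.
    d_img = img_tar - img_src
    d_txt = txt_tar - txt_src
    denom = np.linalg.norm(d_img) * np.linalg.norm(d_txt) + 1e-8
    return float(1.0 - d_img @ d_txt / denom)

def edit_loss(pred, x0, img_src, img_tar, txt_src, txt_tar,
              lam_clip=1.0, lam_recon=1.0):
    # L_edit := lam_clip * L_direction + lam_recon * L_l1   (Eq. 6)
    return (lam_clip * directional_loss(img_src, img_tar, txt_src, txt_tar)
            + lam_recon * l1_loss(pred, x0))
```

The weights `lam_clip` and `lam_recon` correspond to λ_CLIP and λ_recon; the paper does not state their values in the excerpt, so the defaults here are placeholders.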