Bring Metric Functions into Diffusion Models

Authors: Jie An, Zhengyuan Yang, Jianfeng Wang, Linjie Li, Zicheng Liu, Lijuan Wang, Jiebo Luo

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results show that the proposed diffusion model backbone enables the effective use of the LPIPS loss, improving the image quality (FID, sFID) of diffusion models on various established benchmarks. ... Experimental results on CIFAR10 [Krizhevsky et al., 2009], CelebA-HQ [Karras et al., 2017], LSUN Bedroom [Yu et al., 2015], and ImageNet [Deng et al., 2009] show that applying the LPIPS loss on Cas-DM can effectively improve its performance, leading to state-of-the-art image quality (measured by FID [Heusel et al., 2017] and sFID [Nash et al., 2021]) on most datasets. ... 5 Experiments
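The loss weights quoted later in this report (λϵ = λx0 = λµ = 1.0, λlpips = 0.1) imply a simple weighted-sum training objective. A minimal sketch of how those terms would combine, assuming the per-term loss values are scalar placeholders (this is illustrative, not the authors' implementation):

```python
# Hedged sketch (not the authors' code): combining the four loss terms
# with the weights reported in the paper. The loss values passed in are
# hypothetical placeholders, not real model outputs.
def total_loss(l_eps, l_x0, l_mu, l_lpips,
               w_eps=1.0, w_x0=1.0, w_mu=1.0, w_lpips=0.1):
    """Weighted sum of the noise-, x0-, mean-, and LPIPS-loss terms."""
    return w_eps * l_eps + w_x0 * l_x0 + w_mu * l_mu + w_lpips * l_lpips

# Example with unit losses: 1.0 + 1.0 + 1.0 + 0.1 * 1.0
print(total_loss(1.0, 1.0, 1.0, 1.0))  # 3.1
```

The small LPIPS weight (0.1) keeps the perceptual term from dominating the pixel-space objectives.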
Researcher Affiliation | Collaboration | Jie An1, Zhengyuan Yang2, Jianfeng Wang2, Linjie Li2, Zicheng Liu2, Lijuan Wang2, Jiebo Luo1; 1University of Rochester, 2Microsoft
Pseudocode | No | The paper describes its methods through text and diagrams (Figure 3, Figure 4) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | No | The paper mentions implementing Cas-DM based on "the official code of improved diffusion [Nichol and Dhariwal, 2021]" and using the LPIPS loss from the "piq repository" (with a footnote link to GitHub), but it does not state that *their* specific implementation for this paper is open-sourced or provide a direct link to their code.
Open Datasets | Yes | We conduct experiments on the CIFAR10 [Krizhevsky et al., 2009], CelebA-HQ [Karras et al., 2017], LSUN Bedroom, and ImageNet [Deng et al., 2009] datasets.
Dataset Splits | No | The paper mentions using CIFAR10, CelebA-HQ, LSUN Bedroom, and ImageNet datasets but does not explicitly provide specific training, validation, and test splits (e.g., percentages, sample counts, or references to predefined splits for their experiments).
Hardware Specification | Yes | Training is conducted on 8 V100 GPUs with 32GB GPU RAM.
Software Dependencies | No | The paper mentions implementing Cas-DM based on "the official code of improved diffusion" and using the "LPIPS loss from the piq repository", but it does not specify version numbers for Python, PyTorch, CUDA, or other key software dependencies.
Experiment Setup | Yes | We use 4000 diffusion steps with the cosine noise scheduler in all experiments, where the KL loss is not used. ... We set the learning rate to 1e-4 with no learning rate decay. When computing loss functions, λϵ, λx0, and λµ are set to 1.0 while λlpips is set to 0.1. We train the model for 400k iterations and perform sampling and evaluation with a gap of 20k and 100k when the iteration is less than and higher than 100k, respectively. ... We use the DDIM sampler and re-space the diffusion step to 100.
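Two scheduling details in the quoted setup can be sketched concretely: re-spacing 4000 diffusion steps down to 100 DDIM sampling steps, and the evaluation cadence (every 20k iterations up to 100k, then every 100k up to 400k). A minimal sketch, assuming even timestep respacing in the style of the improved-diffusion codebase (illustrative, not the authors' code):

```python
# Hedged sketch: even timestep respacing and the paper's evaluation
# schedule. Function names are hypothetical, not from the authors' code.

def respaced_timesteps(train_steps=4000, sample_steps=100):
    """Pick `sample_steps` evenly spaced timesteps out of `train_steps`."""
    stride = train_steps / sample_steps
    return [round(i * stride) for i in range(sample_steps)]

def eval_iterations(total=400_000):
    """Iterations at which sampling/evaluation runs: every 20k up to
    100k iterations, then every 100k up to `total`."""
    its, step = [], 0
    while step < total:
        step += 20_000 if step < 100_000 else 100_000
        its.append(step)
    return its

print(len(respaced_timesteps()), respaced_timesteps()[:3])  # 100 [0, 40, 80]
print(eval_iterations())
# [20000, 40000, 60000, 80000, 100000, 200000, 300000, 400000]
```

With 4000 training steps and 100 sampling steps, the DDIM sampler visits every 40th timestep, which is what keeps sampling roughly 40x cheaper than stepping through the full chain.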