Bring Metric Functions into Diffusion Models
Authors: Jie An, Zhengyuan Yang, Jianfeng Wang, Linjie Li, Zicheng Liu, Lijuan Wang, Jiebo Luo
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that the proposed diffusion model backbone enables the effective use of the LPIPS loss, improving the image quality (FID, sFID) of diffusion models on various established benchmarks. ... Experimental results on CIFAR10 [Krizhevsky et al., 2009], CelebA-HQ [Karras et al., 2017], LSUN Bedroom [Yu et al., 2015], and ImageNet [Deng et al., 2009] show that applying the LPIPS loss on Cas-DM can effectively improve its performance, leading to the state-of-the-art image quality (measured by FID [Heusel et al., 2017] and sFID [Nash et al., 2021]) on most datasets. ... 5 Experiments |
| Researcher Affiliation | Collaboration | Jie An1 , Zhengyuan Yang2 , Jianfeng Wang2 , Linjie Li2 , Zicheng Liu2 , Lijuan Wang2 , Jiebo Luo1 1University of Rochester 2Microsoft |
| Pseudocode | No | The paper describes its methods through text and diagrams (Figure 3, Figure 4) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The paper mentions implementing Cas-DM based on "the official code of improved diffusion [Nichol and Dhariwal, 2021]" and using the LPIPS loss from the "piq repository" (with a footnote link to GitHub), but it does not state that *their* specific implementation for this paper is open-sourced or provide a direct link to their code. |
| Open Datasets | Yes | We conduct experiments on the CIFAR10 [Krizhevsky et al., 2009], CelebA-HQ [Karras et al., 2017], LSUN Bedroom, and ImageNet [Deng et al., 2009] datasets. |
| Dataset Splits | No | The paper mentions using CIFAR10, CelebA-HQ, LSUN Bedroom, and ImageNet datasets but does not explicitly provide specific training, validation, and test splits (e.g., percentages, sample counts, or references to predefined splits for their experiments). |
| Hardware Specification | Yes | Training is conducted on 8 V100 GPUs with 32GB GPU RAM |
| Software Dependencies | No | The paper mentions implementing Cas-DM based on "the official code of improved diffusion" and using the "LPIPS loss from the piq repository", but it does not specify version numbers for Python, PyTorch, CUDA, or other key software dependencies. |
| Experiment Setup | Yes | We use 4000 diffusion steps with the cosine noise scheduler in all experiments, where the KL loss is not used. ... We set learning rate to 1e-4 with no learning rate decay. When computing loss functions, λϵ, λx0, and λµ are set to 1.0 while λlpips is set to 0.1. We train the model for 400k iterations and perform sampling and evaluation with the gap of 20k and 100k when the iteration is less than and higher than 100k, respectively. ... We use the DDIM sampler and re-space the diffusion step to 100. |
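The Experiment Setup row mentions two reproducible details: a 4000-step cosine noise schedule (as in the improved-diffusion codebase the paper builds on) and re-spacing the diffusion steps to 100 for DDIM sampling. A minimal sketch of both, assuming the standard cosine-schedule formula from Nichol & Dhariwal (2021) and simple even-stride respacing (the function names here are illustrative, not the authors' code):

```python
import numpy as np

def cosine_alpha_bar(t, s=0.008):
    # Cumulative schedule alpha_bar(t) = cos^2(((t + s) / (1 + s)) * pi / 2),
    # per Nichol & Dhariwal (2021); t runs from 0 to 1.
    return np.cos((t + s) / (1 + s) * np.pi / 2) ** 2

def make_betas(num_steps=4000, s=0.008, max_beta=0.999):
    # Discretize alpha_bar over num_steps and convert to per-step betas,
    # clipping near t = 1 for numerical stability (as in improved diffusion).
    t = np.arange(num_steps + 1) / num_steps
    alpha_bar = cosine_alpha_bar(t, s) / cosine_alpha_bar(0.0, s)
    betas = 1.0 - alpha_bar[1:] / alpha_bar[:-1]
    return np.clip(betas, 0.0, max_beta)

def respace_timesteps(num_steps=4000, num_sampling_steps=100):
    # Evenly select a subset of training timesteps for fast DDIM sampling;
    # a simplified stand-in for the codebase's "respacing" logic.
    return np.linspace(0, num_steps - 1, num_sampling_steps).round().astype(int)

betas = make_betas()            # shape (4000,), increasing noise per step
ddim_steps = respace_timesteps()  # 100 timesteps spanning the full schedule
```

The clip at 0.999 prevents the final betas from reaching 1, which would otherwise destroy all signal in the last diffusion steps.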