Lipschitz Singularities in Diffusion Models

Authors: Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical proofs to illustrate the presence of infinite Lipschitz constants and empirical results to confirm it. Extensive experiments on diverse datasets validate our theory and method.
Researcher Affiliation | Collaboration | Zhantao Yang (1,4), Ruili Feng (2,4), Han Zhang (1,4), Yujun Shen (3), Kai Zhu (2,4), Lianghua Huang (4), Yifei Zhang (1,4), Yu Liu (4), Deli Zhao (4), Jingren Zhou (4), Fan Cheng (1). Affiliations: 1 Shanghai Jiao Tong University; 2 University of Science and Technology of China; 3 Ant Group; 4 Alibaba Group. Emails: {ztyang196, ruilifengustc, hzhang9617, shenyujun0302}@gmail.com; zkzy@mail.ustc.edu.cn; xuangen.hlh@alibaba-inc.com; qidouxiong619@sjtu.edu.cn; ly103369@alibaba-inc.com; zhaodeli@gmail.com; jingren.zhou@alibaba-inc.com; chengfan@sjtu.edu.cn
Pseudocode | Yes | In this section, we provide a detailed description of the E-TSDM algorithm, including the training and inference procedures as shown in Algorithm A1 and Algorithm A2, respectively. (An illustrative sketch of the shared-condition mechanism appears after this table.)
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology (E-TSDM) or links to a code repository.
Open Datasets | Yes | Datasets. We implement E-TSDM on several widely evaluated datasets, including FFHQ 256×256 (Karras et al., 2019), CelebA-HQ 256×256 (Karras et al., 2017), AFHQ-Cat 256×256, AFHQ-Wild 256×256 (Choi et al., 2020), LSUN-Church 256×256, and LSUN-Cat 256×256 (Yu et al., 2015).
Dataset Splits | No | The paper mentions training and inference processes and a 'test set' for conditional generation, but it does not provide specific details on the dataset splits (e.g., percentages or sample counts) for training, validation, and testing needed to reproduce the data partitioning.
Hardware Specification | Yes | Furthermore, all experiments are trained on NVIDIA A100 GPUs.
Software Dependencies | No | The paper describes experimental settings and model hyperparameters but does not explicitly provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed for replication.
Experiment Setup | Yes | Implementation details. All of our experiments utilize the settings of DDPM (Ho et al., 2020) (see more details in Appendix B.1). Besides, we utilize a more developed UNet structure (Dhariwal & Nichol, 2021) than that of DDPM (Ho et al., 2020) for all experiments, including the reproduced baseline. Given that the model size is kept constant, the speed and memory requirements for training and inference are the same for both the baseline and E-TSDM. Except for the ablation studies in Section 5.3, all other experiments fix Δt = 100 for E-TSDM and use five conditions (n = 5) in the interval t ∈ [0, Δt), which we have found to be a relatively good choice in practice. (A usage example of this setting follows the table.) Furthermore, all experiments are trained on NVIDIA A100 GPUs. Table A1 lists the hyper-parameters of E-TSDM and the reproduced baseline (normal and large versions), including T, β_t, model size, learning rate, and batch size.
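
The report confirms that pseudocode exists (Algorithms A1 and A2) but does not reproduce it. Below is a minimal PyTorch-style sketch of the core E-TSDM idea as the quoted passages describe it: timesteps in the interval near t = 0 are collapsed onto a small number of shared conditions before being fed to the network. The function names (`shared_timestep`, `etsdm_training_step`) and the DDPM training skeleton are illustrative assumptions, not the authors' released code; inference (Algorithm A2) would apply the same mapping inside a standard DDPM sampling loop.

```python
import torch
import torch.nn.functional as F

def shared_timestep(t, delta_t=100, n=5):
    """Map timesteps to shared conditions (illustrative assumption).

    [0, delta_t) is split into n equal sub-intervals; every t inside a
    sub-interval is replaced by that sub-interval's left endpoint, so the
    network only ever sees n distinct conditions near t = 0. Timesteps at
    or above delta_t pass through unchanged.
    """
    width = delta_t // n                        # sub-interval length (20 in the reported setup)
    return torch.where(t < delta_t, (t // width) * width, t)

def etsdm_training_step(model, x0, alphas_cumprod, T=1000):
    """One DDPM-style training step with E-TSDM condition sharing (sketch)."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)          # uniform timestep sample
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise   # forward process q(x_t | x_0)
    eps_pred = model(x_t, shared_timestep(t))                # shared condition replaces raw t
    return F.mse_loss(eps_pred, noise)
```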
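
As a quick check of the reported hyper-parameter choice (Δt = 100, n = 5), the `shared_timestep` sketch above collapses the interval [0, 100) onto five shared conditions {0, 20, 40, 60, 80} while leaving later timesteps untouched:

```python
import torch

# Reusing the shared_timestep sketch defined above.
t = torch.tensor([0, 7, 19, 20, 55, 99, 100, 500])
print(shared_timestep(t, delta_t=100, n=5))
# tensor([  0,   0,   0,  20,  40,  80, 100, 500])
```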