Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis
Authors: Juyeon Ko, Inho Kong, Dogyun Park, Hyunwoo J. Kim
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that the proposed method generates high-quality samples through extensive experiments and analyses on benchmark datasets, including a novel experimental setup simulating human errors during real-world applications, and achieve competitive results. |
| Researcher Affiliation | Academia | Juyeon Ko*, Inho Kong*, Dogyun Park, Hyunwoo J. Kim. Department of Computer Science, Korea University, Republic of Korea. Correspondence to: Hyunwoo J. Kim <hyunwoojkim@korea.ac.kr>. |
| Pseudocode | Yes | Appendix B (Algorithms): Algorithms 1 and 2 summarize the general training and sampling processes of our SCDM, respectively. (A hedged training-step sketch is given below the table.) |
| Open Source Code | Yes | Code is available at https://github.com/mlvlab/SCDM. |
| Open Datasets | Yes | We evaluate our method on the ADE20K (Zhou et al., 2017) dataset. More experimental results with other benchmark datasets (e.g., CelebAMask-HQ (Lee et al., 2020) and COCO-Stuff (Caesar et al., 2018)) are in Appendix G. |
| Dataset Splits | No | The paper specifies '20K images for training and 2K images for test' for ADE20K, but does not specify a validation split or its size. |
| Hardware Specification | Yes | We trained our model with 4 NVIDIA RTX A6000 GPUs for 1-2 days. Image sampling and evaluations are conducted on a server with 8 NVIDIA RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions software such as AdamW and implies the use of deep learning frameworks, but does not provide version numbers for any software dependencies (e.g., PyTorch, Python, or CUDA versions). |
| Experiment Setup | Yes | For the hyperparameters, we used λ = 0.001 for our hybrid loss (Nichol & Dhariwal, 2021), a classifier-free guidance (Ho & Salimans, 2021) scale of s = 0.5, a 20% condition drop rate for the SIS experiments on three datasets, noise schedule hyperparameter η = 1, a dynamic thresholding (Saharia et al., 2022) percentile of 0.95, and an extrapolation scale of w = 0.8. (A hedged guidance/thresholding sketch is given below the table.) |
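
To make the pseudocode row concrete, the sketch below shows a generic conditional-diffusion training step in which the semantic label map is itself stochastically perturbed before conditioning the denoiser, in the spirit of Algorithm 1. Everything here is a hypothetical stand-in, not the authors' implementation: the `denoiser(x_t, t, labels)` interface, the linear flip-probability schedule in `perturb_labels`, and the plain epsilon-MSE term in place of the paper's hybrid loss (which they weight with λ = 0.001). The actual code is at https://github.com/mlvlab/SCDM.

```python
import torch
import torch.nn.functional as F

def perturb_labels(labels, t, num_classes, num_timesteps=1000):
    """Hypothetical categorical forward process: resample each pixel's class
    uniformly at random with a probability that grows with the timestep t.
    (A stand-in for the paper's label-diffusion process, not its exact form.)"""
    flip_prob = (t.float() / num_timesteps).view(-1, 1, 1)   # assumed linear schedule
    flip_mask = torch.rand(labels.shape, device=labels.device) < flip_prob
    random_labels = torch.randint_like(labels, num_classes)
    return torch.where(flip_mask, random_labels, labels)

def training_step(denoiser, x0, labels, alphas_cumprod, num_classes=151):
    """One generic training step: diffuse the image, stochastically corrupt the
    semantic condition, and regress the added noise. Only the simple
    epsilon-MSE term is shown; the paper uses the hybrid loss of
    Nichol & Dhariwal (2021) with lambda = 0.001."""
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise   # q(x_t | x_0)
    noisy_labels = perturb_labels(labels, t, num_classes)    # stochastic condition
    pred_noise = denoiser(x_t, t, noisy_labels)
    return F.mse_loss(pred_noise, noise)
```

The key difference from a standard conditional diffusion training loop is the `perturb_labels` call: corrupting the condition during training is what makes the sampler robust to noisy or erroneous semantic maps at test time.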
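Similarly, the guidance scale s = 0.5, the 20% condition drop rate, and the dynamic-thresholding percentile of 0.95 from the experiment-setup row combine at sampling time roughly as sketched below. This is a minimal illustration of one common parameterization of classifier-free guidance (Ho & Salimans, 2021) and of dynamic thresholding (Saharia et al., 2022), with an assumed `denoiser(x_t, t, cond)` interface; it is not the paper's exact sampler, and the extrapolation scale w = 0.8 concerns the paper's condition handling and is not reproduced here.

```python
import torch

def guided_noise(denoiser, x_t, t, cond, s=0.5):
    """Classifier-free guidance (Ho & Salimans, 2021): extrapolate the noise
    prediction away from the unconditional branch with scale s. The
    unconditional branch exists because the condition was dropped (here,
    passed as None) 20% of the time during training."""
    eps_cond = denoiser(x_t, t, cond)
    eps_uncond = denoiser(x_t, t, None)
    return eps_uncond + (1.0 + s) * (eps_cond - eps_uncond)

def dynamic_threshold(x0_hat, percentile=0.95):
    """Dynamic thresholding (Saharia et al., 2022): clip the predicted clean
    image at a per-sample quantile of its absolute pixel values, then rescale,
    so saturated pixels do not dominate as guidance pushes values outside [-1, 1]."""
    b = x0_hat.shape[0]
    q = torch.quantile(x0_hat.abs().reshape(b, -1), percentile, dim=1)
    q = torch.clamp(q, min=1.0).view(b, 1, 1, 1)
    return x0_hat.clamp(-q, q) / q
```

At each reverse step, the guided noise estimate would be converted to a predicted clean image, passed through `dynamic_threshold`, and then used to form the next latent; that outer loop is omitted here since it follows the standard diffusion sampler the paper builds on.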