Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Diffusion Models Already Have A Semantic Latent Space
Authors: Mingi Kwon, Jaeseok Jeong, Youngjung Uh
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS In this section, we show the effectiveness of semantic latent editing in h-space with Asyrp on various attributes, datasets and architectures in 5.1. Moreover, we provide quantitative results including user study in 5.2. |
| Researcher Affiliation | Academia | Mingi Kwon, Jaeseok Jeong, Youngjung Uh Department of Artificial Intelligence Yonsei University Seoul, Republic of Korea EMAIL |
| Pseudocode | Yes | Algorithm 1: Editing (Inference) and Algorithm 2: Training Neural implicit function f_t (in Appendix I) |
| Open Source Code | Yes | The code is available at https://github.com/kwonminki/Asyrp_official |
| Open Datasets | Yes | CelebA-HQ (Karras et al., 2018) and LSUN-bedroom/-church (Yu et al., 2015) on DDPM++ (Song et al., 2020b) (Meng et al., 2021); AFHQ-dog (Choi et al., 2020) on iDDPM (Nichol & Dhariwal, 2021); and METFACES (Karras et al., 2020) on ADM with P2-weighting (Dhariwal & Nichol, 2021) (Choi et al., 2022). |
| Dataset Splits | No | We train f_t with S = 40 for 1 epoch using 1000 samples. The real samples are randomly chosen from each dataset for in-domain-like attributes. For out-of-domain-like attributes, we randomly draw 1,000 latent variables x_T ∼ N(0, I). (No explicit train/validation/test splits are mentioned for reproduction; only the total number of samples used to train f_t is given.) |
| Hardware Specification | Yes | Training takes about 20 minutes with three RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions software components and models like CLIP and U-Net but does not provide specific version numbers for any software dependencies required for reproducibility. |
| Experiment Setup | Yes | We train f_t with S = 40 for 1 epoch using 1000 samples. The real samples are randomly chosen from each dataset for in-domain-like attributes. For out-of-domain-like attributes, we randomly draw 1,000 latent variables x_T ∼ N(0, I). Detailed settings including the coefficients for λ_CLIP and λ_recon, and source/target descriptions can be found in Appendix J.1. |
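
The Dataset Splits and Experiment Setup rows describe randomly choosing 1,000 real samples and randomly drawing 1,000 latent variables from N(0, I), but no seed or index list is reported. As a rough illustration of how such a selection could be made repeatable, here is a minimal Python sketch using only the standard library. It is not taken from the paper or its repository: the function names, the seed, the dataset size of 30,000, and the toy latent dimension of 8 are all assumptions chosen for illustration (real latents would have the shape of an image tensor).

```python
import random


def choose_training_indices(dataset_size, num_samples=1000, seed=0):
    """Pick a reproducible random subset of dataset indices, without replacement.

    Hypothetical helper: the seed and sizes are illustrative assumptions.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(dataset_size), num_samples))


def draw_latents(num_latents=1000, latent_dim=8, seed=0):
    """Draw latent vectors whose entries are i.i.d. standard normal, x_T ~ N(0, I).

    latent_dim=8 is a toy size chosen to keep the example small.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        for _ in range(num_latents)
    ]


# Assumed dataset size, for illustration only.
indices = choose_training_indices(dataset_size=30000)
latents = draw_latents()
```

Recording the seed (or the resulting index list) alongside the results would let a reader recover the same 1,000 samples on a later run.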